Dimension/Enums

Some statistics are available on a more granular level. This is sometimes referred to as slicing the data (cube) across dimensions or as splitting it according to enums. This package sticks to the latter.

An enumeration, or enum, “is a set of symbolic names (members) bound to unique, constant values” ([python enums](https://docs.python.org/3/library/enum.html)). It is stricter than a dictionary, as it ensures, for example, uniqueness.
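The difference to a dictionary can be illustrated with Python's standard `enum` module. This is only a minimal sketch: the `Gender` class is a hypothetical local mirror of the gender codes that appear later in this section, not something datenguidepy exposes.

```python
from enum import Enum, unique


@unique  # additionally rejects two member names bound to the same value
class Gender(Enum):
    """Hypothetical local mirror of the gender codes; not part of datenguidepy."""
    GESM = "männlich"
    GESW = "weiblich"
    GESAMT = "Gesamt"


# A dict literal accepts a repeated key and silently keeps the last value.
as_dict = {"GESM": "männlich", "GESM": "weiblich"}
print(as_dict)

# An Enum refuses to bind the same member name twice.
duplicate_rejected = False
try:
    class Broken(Enum):
        GESM = "männlich"
        GESM = "weiblich"
except TypeError:
    duplicate_rejected = True

print("duplicate member name rejected:", duplicate_rejected)
```

The dictionary loses an entry without complaint, while the enum definition fails immediately, which is the kind of strictness meant above.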

To extract more granular data, two lines need to be added to what would otherwise be a minimal example:

from datenguidepy.query_builder import Query

q = Query.region('01')
stat = q.add_field('BEVSTD')
stat.add_field('GES') # add gender column to the output
stat.add_args({'GES':'ALL'}) # request all genders (plus total)
q.results().head(6).iloc[:,:5]

| id | name | GES | year | BEVSTD |
|---|---|---|---|---|
| 01 | Schleswig-Holstein | GESM | 1995 | 1330257 |
| 01 | Schleswig-Holstein | GESW | 1995 | 1395204 |
| 01 | Schleswig-Holstein |  | 1995 | 2725461 |
| 01 | Schleswig-Holstein | GESM | 1996 | 1339326 |
| 01 | Schleswig-Holstein | GESW | 1996 | 1402967 |
| 01 | Schleswig-Holstein |  | 1996 | 2742293 |
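As a quick sanity check on the output above, the rows with an empty GES cell should be the totals over the two gender rows of the same year. The sketch below uses plain Python with the figures copied by hand from the table; representing the empty GES cell as `None` is an assumption made for this illustration.

```python
# Rows copied from the output above as (GES, year, BEVSTD).
# The empty GES cell of the total rows is represented as None here (assumption).
rows = [
    ("GESM", 1995, 1330257),
    ("GESW", 1995, 1395204),
    (None, 1995, 2725461),
    ("GESM", 1996, 1339326),
    ("GESW", 1996, 1402967),
    (None, 1996, 2742293),
]

member_sums = {}  # year -> sum over the individual gender rows
totals = {}       # year -> value of the row without a gender

for ges, year, value in rows:
    if ges is None:
        totals[year] = value
    else:
        member_sums[year] = member_sums.get(year, 0) + value

for year in sorted(totals):
    matches = member_sums[year] == totals[year]
    print(year, member_sums[year], totals[year], matches)
```

For both years the sum of the GESM and GESW rows equals the row without a gender.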

The first added line only causes the gender information to be added to the output. By itself it would not change which rows are returned, because the package always provides totals across the enums by default. The second line changes that and tells datenguidepy to request the data for each gender individually, in addition to the total. This line by itself would also not suffice, as the user could not distinguish the different results without adding the gender column:

from datenguidepy.query_builder import Query

q = Query.region('01')
stat = q.add_field('BEVSTD')
# comment out adding the enums:
# stat.add_field('GES') # add gender column to the output
stat.add_args({'GES':'ALL'}) # request all genders (plus total)
q.results().head(6).iloc[:,:5]

| id | name | year | BEVSTD | BEVSTD_source_title_de |
|---|---|---|---|---|
| 01 | Schleswig-Holstein | 1995 | 1330257 | Fortschreibung des Bevölkerungsstandes |
| 01 | Schleswig-Holstein | 1995 | 1395204 | Fortschreibung des Bevölkerungsstandes |
| 01 | Schleswig-Holstein | 1995 | 2725461 | Fortschreibung des Bevölkerungsstandes |
| 01 | Schleswig-Holstein | 1996 | 1339326 | Fortschreibung des Bevölkerungsstandes |
| 01 | Schleswig-Holstein | 1996 | 1402967 | Fortschreibung des Bevölkerungsstandes |
| 01 | Schleswig-Holstein | 1996 | 2742293 | Fortschreibung des Bevölkerungsstandes |

The 'ALL' argument works for all enums and is usually the most convenient way to obtain the information. Alternatively, the data for a single enum member can be requested by naming that member explicitly. In the case of gender, the members are 'GESM' and 'GESW'.

To find out which enums exist for a particular statistic and which members they have, call the .get_info method on the statistic. In this case that means calling stat.get_info(), which prints detailed information about the statistic, including its enums. Note that enum names can be empty in the database if they are defined but not populated.

stat.get_info() prints (shortened):

kind:
OBJECT

description:
Bevölkerungsstand

arguments:
year: LIST of type SCALAR(Int)

statistics: LIST of type ENUM(BEVSTDStatistics)
enum values:
R12411: Fortschreibung des Bevölkerungsstandes
R32211: Erhebung der öffentlichen Wasserversorgung

ALTX75: LIST of type ENUM(ALTX75)
enum values:
ALT000: unter 1 Jahr
...
ALT085UM: 85 Jahre und mehr
GESAMT: Gesamt

GES: LIST of type ENUM(GES)
enum values:
GESM: männlich
GESW: weiblich
GESAMT: Gesamt

ALTX21: LIST of type ENUM(ALTX21)
enum values:
ALT000B03: unter 3 Jahre
...
ALT090UM: 90 Jahre und mehr
GESAMT: Gesamt

NAT: LIST of type ENUM(NAT)
enum values:
NATA: Ausländer(innen)
NATD: Deutsche
GESAMT: Gesamt

ALTX76: LIST of type ENUM(ALTX76)
enum values:
ALT000: unter 1 Jahr
...
ALT090UM: 90 Jahre und mehr
GESAMT: Gesamt

ALTX20: LIST of type ENUM(ALTX20)
enum values:
ALT000B03: unter 3 Jahre
...
ALT075UM: 75 Jahre und mehr
GESAMT: Gesamt

filter: INPUT_OBJECT(BEVSTDFilter)

fields:
id: Interne eindeutige ID
year: Jahr des Stichtages
value: Wert
source: Quellenverweis zur GENESIS Regionaldatenbank
ALTX75: Altersjahre (unter 1 bis 75, Altersgruppen)
GES: Geschlecht
ALTX21: Altersgruppen (unter 3, 5er-Schritte, 90 und mehr)
NAT: Nationalität
ALTX76: Altersjahre (unter 1 bis 90, Altersgruppen)
ALTX20: Altersgruppen (unter 3 bis 75 u. m.)

enum values:
None

See the GES entry under arguments (lines 22-26 of the output) for the enum used in the example discussed above.
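The member descriptions listed for an enum can be turned into a lookup table, for instance to label result rows. The following is a rough sketch only: it assumes the GES block has been copied by hand from the printed output into a string, and `parse_enum_values` is a hypothetical helper, not part of datenguidepy.

```python
# The GES block as printed above, pasted in by hand (assumption: get_info()
# prints to the screen, so the text is not available as a return value).
GES_BLOCK = """\
GES: LIST of type ENUM(GES)
enum values:
GESM: männlich
GESW: weiblich
GESAMT: Gesamt
"""


def parse_enum_values(block):
    """Return {member: description} for one argument block (hypothetical helper)."""
    lines = [line.strip() for line in block.strip().splitlines()]
    start = lines.index("enum values:") + 1
    values = {}
    for line in lines[start:]:
        if ": " not in line:
            continue  # skip shortened parts such as "..."
        member, _, description = line.partition(": ")
        values[member] = description
    return values


ges_labels = parse_enum_values(GES_BLOCK)

# Translate the member codes that appear in the GES column into descriptions.
for code in ["GESM", "GESW", "GESAMT"]:
    print(code, "->", ges_labels.get(code, "unknown member"))
```

The same approach should work for the other enums in the output, as long as the block follows the layout shown above.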

One last variation of our example, to summarize:

from datenguidepy.query_builder import Query

| id | name | GES | year | BEVSTD |
|---|---|---|---|---|
| 01 | Schleswig-Holstein | GESAMT | 1995 | 2725461 |
| 01 | Schleswig-Holstein | GESAMT | 1996 | 2742293 |
| 01 | Schleswig-Holstein | GESAMT | 1997 | 2756473 |
| 01 | Schleswig-Holstein | GESAMT | 1998 | 2766057 |
| 01 | Schleswig-Holstein | GESAMT | 1998 | 2766057 |
| 01 | Schleswig-Holstein | GESAMT | 1999 | 2777275 |