terrykrohe

terrykrohe OP t1_j44xmww wrote

comments w/r AR, LA, NY, NJ metrics

Purpose
– the "forest": Previous posts (summary post, 14Apr2022) did not identify the individual states. The overall non-random, top/bottom, Rep/Dem differentiation was the point. Curiously, this differentiation persisted within the Rep/Dem state groupings – for example, as Rep states' R:D vote ratio increased, their infant mortality increased and their suicide rates increased (see posts 07Apr, 21Apr).

– the "trees": This post presents four individual states and their metrics for comparison.
– z-scores are used so that dimensionless comparison can be done
– note: a negative z-score of a negative metric is considered positive

Comparing and Contrasting the four states ...
i) two Dem states and two Rep states: ranking the states, NY and NJ are at the top end and AR and LA are at the bottom end
ii) the z-scores show why: NY and NJ have large (+) values for positive metrics and large (–) values for negative metrics
iii) compare with AR and LA: both AR and LA have large (–) values for (+) metrics; and large (+) values for negative metrics
iv) LA is curious ... being middle-of-the-road for Predictor metrics
v) the AR large (+) evangelical value stands out compared with the NY and NJ large (–) values for the evangelical Predictor metric
vi) NJ is very urban (the most urban state)
vii) the NY and NJ diversity* values are large – no. 3 and no. 2 (HI is no. 1)
viii) contrasting the Rep and Dem states: the suicide, opioid dispensing rate, and incarceration rate differences are remarkable

similar visuals of other states' metric z-scores:
TX, AZ, FL, GA posted 02Jun2022
ND, SD, WA, OR posted 09Jun
AL, MS, CT, RI posted 14Jul
IL, IN, OH, PA posted 08Sep
NH, MA, NE, IA posted 15Sep
KS, MO, ID, MT posted 13Oct
UT, OK, CO, NM posted 10Nov
NC, SC, DE, MD posted 08Dec

1

terrykrohe OP t1_j44wkbu wrote

sources:

GDP
https://apps.bea.gov/regional/downloadzip.cfm
state taxes
https://wallethub.com/edu/states-with-highest-lowest-tax-burden/20494
suicide rate
https://www.cdc.gov/nchs/pressroom/sosmap/suicide-mortality/suicide.htm
opioid dispensing rate
https://www.cdc.gov/drugoverdose/maps/rxstate2019.html
life expectancy
https://www.cdc.gov/nchs/pressroom/sosmap/life_expectancy/life_expectancy.htm
infant mortality
https://www.cdc.gov/nchs/pressroom/sosmap/infant_mortality_rates/infant_mortality.htm
incarceration rate
https://www.sentencingproject.org/the-facts/#map
state+local ed spending
https://www.usgovernmentspending.com/compare_state_spending_2019b20a#copypaste

evangelical
https://www.pewforum.org/religious-landscape-study/religious-tradition/evangelical-protestant/

diversity*
– diversity* = Catholic% + Jewish% + Muslim% + Asian%
– Catholic, Jewish, Muslim populations: https://www.pewforum.org/religious-landscape-study/compare/religious-tradition/by/state/
– Asian population: https://en.wikipedia.org/wiki/Demographics_of_Asian_Americans

rural-urban
– population density https://www.states101.com/populations
– agriculture income https://data.ers.usda.gov/reports.aspx?ID=17839#P9dd070795569412d9525def18d45bde2_4_185iT0R0x0

method for "rural-urban" metric
– population density and agriculture income data values were converted to "standard scores", aka "z-scores": z-score = (data value – mean)/SD (see Wikipedia, "Standard score")
– the z-scores were added and divided by 2; result = the rural/urban metric z-score
– note1: 'urban' means "increasing population density"; 'rural' means "increasing agriculture income as % of state GDP"
for the 'rural' metric to denote a "rural to urban" value, the z-scores for agriculture income were 'reversed' by multiplying by "–1" before adding to the population density z-scores
– note2: for the 'rural-urban' metric ... a negative z-score indicates a "rural" value; a positive z-score indicates an "urban" value
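
The method above can be sketched in a few lines. This is only an illustration, not the Mathematica code used for the plots: the function names and the three-state input values are made up, and the use of the population SD (rather than the sample SD) is an assumption, since the method does not say which was used.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standard scores: (data value - mean) / SD for each value."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population SD; the source does not specify
    return [(v - m) / sd for v in values]


def rural_urban(pop_density, ag_income_pct):
    """Average of the density z-score and the reversed agriculture z-score."""
    z_density = z_scores(pop_density)
    z_ag = z_scores(ag_income_pct)
    # agriculture z-scores are multiplied by -1 so that "more rural" is negative
    return [(zd + (-1.0) * za) / 2 for zd, za in zip(z_density, z_ag)]


# made-up values for three hypothetical states (not real data)
density = [10.0, 100.0, 1000.0]   # people per square mile
ag_pct = [8.0, 2.0, 0.5]          # agriculture income as % of state GDP
scores = rural_urban(density, ag_pct)
```

With these made-up inputs the sparsely populated, agriculture-heavy state comes out negative ("rural") and the dense, low-agriculture state comes out positive ("urban"), as note2 describes.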

tool: Mathematica

***************

i) all of the metric values have been converted to z-scores; this allows dimension-less comparison and averaging
ii) the red/blue background indicates the state's Rep/Dem Electoral Vote in 2020
iii) in the calculation of the average, "negative metric" values were multiplied by "–1"
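
A minimal sketch of the averaging in iii), assuming the z-scores are already computed. The metric names, the choice of which metrics count as "negative", and the z-values are illustrative assumptions, not values from the plots.

```python
def signed_average(z_by_metric, negative_metrics):
    """Average z-scores after multiplying "negative metric" values by -1."""
    signed = [
        -z if name in negative_metrics else z
        for name, z in z_by_metric.items()
    ]
    return sum(signed) / len(signed)


# hypothetical z-scores for one state (not real data)
example_state = {
    "GDP": 1.0,
    "life expectancy": 0.5,
    "suicide rate": -1.5,
    "infant mortality": -1.0,
}
# assumption: these two are treated as "negative" metrics
negative = {"suicide rate", "infant mortality"}

average = signed_average(example_state, negative)
```

Here the two negative metrics have negative z-scores, so after the sign flip they raise the average rather than lower it.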

1

terrykrohe OP t1_j2a47t4 wrote

from "sources" comment below:

Missing persons and 'rural-urban' metrics: note that the missing persons t-test indicates that data fluctuations are probably "random" in character. (t-test = 0.96)
The large difference of 'rural-urban' means (> 1 SD) for Rep and Dem states indicates that Rep and Dem states are different Sample populations. (unreported t-test = 0.000126)

more about the t-test:
https://en.wikipedia.org/wiki/Student%27s_t-test
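
For readers who want to try this on their own numbers, here is a rough sketch of a two-sample Student's t-test using only the Python standard library. It is not the Mathematica calculation behind the posts. Two assumptions are made: that the pooled-variance form of the test applies, and that the quoted values (0.96, 0.000126) are p-values. The p-value here uses a normal approximation instead of the t-distribution, so it is only rough for small groups. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


def approx_two_sided_p(t_stat):
    """Normal approximation to the two-sided p-value (not the exact t-based value)."""
    return 2 * NormalDist().cdf(-abs(t_stat))


# made-up groups (not real state data)
t_stat, df = student_t([1.0, 2.0, 3.0], [11.0, 12.0, 13.0])
p_approx = approx_two_sided_p(t_stat)
```

Groups with nearly equal means give a p-value near 1 (the difference is attributable to random fluctuation); groups with very different means give a p-value near 0.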

1

terrykrohe OP t1_j27zsfh wrote

... is there a relationship between a state's rural/urban character and its missing persons, yes or no? Answer: No.

Note that this is a different answer than for a state's GDP: Dem states' GDP Increases with increasing urban character; Rep states' GDP Decreases with increasing urban character (posted 30Dec2021). This Rep/Dem differentiation repeats with all other metrics (suicide rate, obesity, infant mortality, etc) except for missing persons.

... I should have added a few other plots for comparison

1

terrykrohe OP t1_j27z5ou wrote

... the upper right plot shows the 'rural-urban' values of the fifty states, ranked from more rural to more urban. This uses a definition described in the "sources" comment below.

... the bottom plot relates the top two plots: for each state its ('rural-urban', missing persons) coordinate is plotted. Is there a relationship? Do more missing persons come from rural or urban states? The plot indicates that there is little (essentially no) relationship: tell me a state's 'rural-urban' value and I cannot tell you anything about that state's missing persons.

1

terrykrohe OP t1_j27j6ge wrote

... the t-test (0.96) quantifies only the missing persons Rep and Dem data

... there is no t-test reported for the 'rural-urban' metric (it would be very small because the means are very different)

... the point is that missing persons data is very different in character than the data of other metrics (e.g. 'rural-urban' is shown here, but GDP and others, posted previously, would be similar to 'rural-urban')

0

terrykrohe OP t1_j27gjbq wrote

... the missing persons data for the fifty states shows random character: the t-test value of 0.96 indicates that the means of the Rep and Dem states can be attributed to random fluctuations

... when this data is compared with data for GDP, suicide, life expectancy, etc (see post 14Apr), the contrast is impressive; and the wonder of it all is "why is there a Rep and Dem difference in the data?" (note: Rep states are always on the negative side of the comparison: less GDP, more obese, more infant mortality, etc)

1

terrykrohe OP t1_j27a6ot wrote

other comments for missing persons VS 'rural-urban'
i) The missing persons metric is the only metric which can be described as "random":
thus, it provides contrast for non-random metrics
– the non-random character of other metrics is emphasized when visualized against the missing persons visual
ii) the Alaska outlier point is curious: probably due to boating and winter incidents for which no bodies are found.

2

terrykrohe OP t1_j279xgo wrote

sources
– missing persons https://namus.nij.ojp.gov
The National Missing and Unidentified Persons System (NamUs), US Census Bureau 2020 Population Data
– population density https://www.states101.com/populations (2014 population estimates)
– agriculture income https://data.ers.usda.gov/reports.aspx?ID=17839#P9dd070795569412d9525def18d45bde2_4_185iT0R0x0

method for "rural-urban" metric
– population density and agriculture income data values were converted to "standard scores", aka "z-scores": z-score = (data value – mean)/SD (see Wikipedia, "Standard score")
– the z-scores were added and divided by 2; result = the 'rural-urban' metric z-score
– note1: 'urban' means "increasing population density"
'rural' means "increasing agriculture income as % of state GDP"
for the 'rural' metric to denote a "rural to urban" value,
the z-scores for agriculture income were 'reversed' by multiplying by "–1"
before adding to the population density z-scores
– note2: "NCE" is "normal curve equivalent" (see Wikipedia, "Normal curve equivalent")
tool: Mathematica
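
For note2, the usual definition of the normal curve equivalent is a linear rescaling of the z-score, NCE = 50 + 21.06 × z. A one-function sketch follows; the function name is hypothetical, and how the NCE values were actually used in the plots is not stated in this comment.

```python
def nce(z):
    """Normal curve equivalent of a z-score: 50 + 21.06 * z."""
    return 50 + 21.06 * z


# a z-score of 0 (the mean) maps to the middle of the NCE scale
nce_at_mean = nce(0.0)
# one SD above and below the mean
nce_plus_one = nce(1.0)
nce_minus_one = nce(-1.0)
```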

***************

top two plots
Missing persons and 'rural-urban' metrics: note that the missing persons t-test indicates that data fluctuations are probably "random" in character.
The large difference of 'rural-urban' means (> 1 SD) for Rep and Dem states indicates that Rep and Dem states are different Sample populations.

the bottom plot
– Missing persons VS 'rural-urban' predictor metric: the r-value of -0.11 indicates that the data is essentially "noise" about the best-fit line.
– Note that purple is used for best-fit line, mean, and SD because the Rep and Dem states data are NOT different Sample populations.

1

terrykrohe OP t1_ixpam8r wrote

SUBSTANTIVE findings ...
1 – There is a non-random, top/bottom, Dem/Rep pattern.
2 – Rep states are always on the negative side.
3 – In all of the previous posts, there has been a nagging thought: Is there some way to quantify that some correlations are more important? The "impact" quantity does so (I think): the evangelical-incarceration correlation for the Dem states sticks out ... is this a one-time coincidence? ... will it be so for other 'response' metrics?
What did you learn about Repubs and Dems ...?
The truth of 1 and 2: Thanksgiving politics arguments are NOT just "my opinion, your opinion" – the Rep states are less productive, more obese, more suicidal, have less life expectancy, more infant mortality, more accidental deaths, receive more federal funds than they give in taxes, have a higher opioid dispensing rate, higher serious crime rate, spend less on education, have lower median incomes ... (most of the foregoing have been previously posted).
... jeez, the statistical improbability of "150 million voters, acting individually, separate(ing) the fifty states into two such disparate groups" is just plain awe-fully "mysterious".
(The most beautiful thing we can experience is the mysterious. It is the source of all true art and science. He to whom the emotion is a stranger, who can no longer pause to wonder and stand wrapped in awe, is as good as dead; his eyes are closed.)

6

terrykrohe OP t1_ixoydmd wrote

incarceration vs 'predictor' variables

  1. Purpose
    In order to 'understand' the non-random, top/bottom, Rep/Dem differentiation of metric values, eight "response" metrics are correlated with three "predictor" metrics. This post presents the 'response' variable incarceration vs the three 'predictor' variables.
    ... the eight "response" metrics: GDP, state taxes; suicide rate, opioids; life expectancy, infant mortality; incarceration, state+local ed spending;
    ... the three "predictor" metrics: 'rural-urban', evangelical, diversity*
  2. the "big picture"
    i) There is a non-random, top/bottom, Dem/Rep pattern. Patterns have reasons/causes and are mathematical.
    ii) Rep states are always on the negative side (less GDP, more suicides, lower life expectancy, etc).
    iii) How did 150 million voters, acting individually, separate the fifty states into two such disparate groups?
    iv) is there a "predictive" metric or combination of metrics which can be used to explain the characteristic Rep/Dem differences seen in the data?
  3. other comments
    i) the plots present means, standard deviations, the 'best-fit' lines, and Pearson r-values for Rep and Dem states ... P-values for the associated r-values are used to obtain a calculated 'impact' quantity. The 'impact' quantifies the significance/importance of a predictor metric.
    ii) the correlation of Dem evangelical with the Dem incarceration has an impact of 19500; which is very much larger than the other predictor/response metric combinations – in other words, increasing evangelical population of Dem states greatly influences Dem states' increasing incarceration rates; much more so than the Rep states' evangelical population influence upon incarceration rates.
    iii) the impacts of the 'rural-urban' and diversity* metrics upon the states' incarceration rates are not important
  4. Similar plots using the three 'predictor' metrics have been posted:
    for GDP (20Jan), state taxes (17Feb), suicide rate (17Mar), opioid dispensing rate (26May), life expectancy (18Aug), and infant mortality (06Oct)
9

terrykrohe OP t1_iwt65q0 wrote

the "impact" quantity
1
looking at the bottom plot ... which data set is more 'important' – the Rep states or the Dem states?
the slope of the Rep states best-fit line is larger, does that make Rep states data more significant?
the correlation value of the Rep states best-fit line is larger, does that make Rep states data more significant?

2
The correlation values of +1, 0, –1 are well-defined; but the in-between values are not well-defined: I have learned that when using r-values to give meaning to data, the r-values need to be compared "relatively". Still, I am not sure how much more meaningful data with an r-value of 0.62 is compared to data with an r-value of 0.50. It would seem that 0.62 is more meaningful than 0.50.

3
Each r-value has an associated P-value
– The r-value is a measure of the "strength" of the data used to fit the slope of the best-fit line.
– The P-value measures how likely an r-value at least this large would be if the data were only random fluctuations: a low P-value indicates that the given r-value is less likely to be due to random fluctuations; a large P-value indicates the r-value is more likely to be due to random fluctuations of the Population; thus, a low P-value has more significance than a large P-value.
– defining the "impact" as "r-value/P-value" yields a quantity which can be used to quantify the significance of the data points about a best-fit line.

4
For the B/W income ratio vs B/W incarceration ratio plot, the Rep best-fit line has an 'impact' of "–306" and the Dem best-fit line has an 'impact' of "–1.3".
Conclusion: The Rep states' data is more significant/important than the Dem states' data.
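
The "impact" definition in 3 can be written out as a short sketch. The Pearson r calculation is the standard one; the P-value is taken as an input rather than computed, and the r-value and P-value in the example are made-up numbers, not the ones behind the "–306" and "–1.3" results. The function names are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation value of paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def impact(r_value, p_value):
    """'impact' as defined in the comment: r-value divided by P-value."""
    if p_value <= 0:
        raise ValueError("P-value must be positive")
    return r_value / p_value


# made-up inputs for illustration only
example_r = pearson_r([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.0, 2.0])
example_impact = impact(-0.5, 0.002)
```

The sign of the impact follows the sign of the r-value, and because the P-value is in the denominator, a very small P-value produces a very large impact.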

−1

terrykrohe OP t1_iwt54av wrote

sources
income differences
https://www.stlouisfed.org/publications/bridges/volume-3-2020/examining-us-economic-racial-inequality-by-state
incarceration differences
https://bjs.ojp.gov/content/pub/pdf/p20st.pdf (pg 10, Table 3)

tool: Mathematica

***************

– the plot points represent the fifty US states; and are colored according to their 2020 Electoral College vote
– top two plots: the dashed lines are the means; the 'boxes' are ± one standard deviation (SD) from the mean
– the parenthetical percent is the "relative standard deviation" (RSD)
– bottom plot: the ellipses are centered on the Rep/Dem means; the standard deviations are represented by the ellipses' axes
– the plot points are the (B/W income ratio, B/W incarceration ratio) coordinates for each state (excepting missing income data for three states)
– see comment below for definition of "impact"
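
As a small illustration of the RSD percent mentioned above: relative standard deviation is the SD expressed as a percent of the mean. The sketch below uses the population SD, which is an assumption (the plots may use the sample SD), and the data values are made up.

```python
from statistics import mean, pstdev


def rsd_percent(values):
    """Relative standard deviation: 100 * SD / |mean|."""
    # assumption: population SD; the source does not say which SD was used
    return 100 * pstdev(values) / abs(mean(values))


# made-up data with mean 5 and population SD 2
example_rsd = rsd_percent([2, 4, 4, 4, 5, 5, 7, 9])
```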

other comments:
(this post content was suggested in a comment for a previous post; "best-fit lines, correlations: incarceration vs evangelical", posted 27Oct)

i) the B/W income ratio data is pretty much the same for Rep and Dem states
ii) the B/W incarceration ratio data is about a third larger for Dem states than for Rep states; the four largest B/W incarceration ratios were for New York, New Jersey, Maryland, and Louisiana (descending incarceration order) ... an unexpected result
iii) for both Rep and Dem states, as the B/W income ratio increases, the B/W incarceration ratio decreases
iv) the Rep states' slope is larger than the Dem states' slope; also, the Rep r-value is greater than the Dem r-value
v) the B/W income ratio does NOT follow the non-random, top/bottom Rep/Dem pattern seen for previous metrics; which indicates that B/W income inequality is not biased because of a state's political orientation

2

terrykrohe OP t1_iv26yb2 wrote

sources copied from "top comment" and re-reported here:

sources
– incarceration
https://www.sentencingproject.org/the-facts/#map
diversity*
– diversity* = Catholic% + Jewish% + Muslim% + Asian%
– Catholic, Jewish, Muslim populations: https://www.pewforum.org/religious-landscape-study/compare/religious-tradition/by/state/
– Asian population: https://en.wikipedia.org/wiki/Demographics_of_Asian_Americans
tool: Mathematica

***************

– the ellipses are centered on the Rep/Dem means;
the standard deviations are represented by the ellipses' axes
– the 50 plot points represent the (diversity*, incarceration) coordinates for each state; and are colored according to their 2020 Electoral College vote
– "r" is the Pearson correlation value

1

terrykrohe OP t1_iuzqxbj wrote

best-fit lines, correlations: incarceration vs diversity*

  1. Purpose
    In order to 'understand' the non-random, top/bottom, Rep/Dem differentiation of metric values, eight "response" metrics are correlated with three "predictor" metrics. This post presents the 'response' variable incarceration vs the diversity* 'predictor' metric.
    ... the eight "response" metrics: GDP, state taxes; suicide rate, opioids; life expectancy, infant mortality; incarceration, state+local ed spending;
    ... the three "predictor" metrics: 'rural-urban', evangelical, diversity*
  2. the "big picture"
    i) There is a non-random, top/bottom, Dem/Rep pattern. Patterns have reasons/causes and are mathematical.
    ii) Rep states are always on the negative side (less GDP, more suicides, lower life expectancy, etc).
    iii) How did 150 million voters, acting individually, separate the fifty states into two such disparate groups?
    iv) is there a "predictive" metric or combination of metrics which can be used to explain the characteristic Rep/Dem differences seen in the data?
  3. general comments
    i) Dem states are twice as "diverse" as Rep states
    ii) for both Rep states and Dem states: as diversity* increases, the incarceration rate decreases; though the Dem states 'best-fit' line is more convincing (has a larger r-value)
    iii) Hawaii has the largest diversity* rating – due to its large Asian population
1

terrykrohe OP t1_iuzqkef wrote

sources
– incarceration
https://www.sentencingproject.org/the-facts/#map
diversity*
– diversity* = Catholic% + Jewish% + Muslim% + Asian%
– Catholic, Jewish, Muslim populations: https://www.pewforum.org/religious-landscape-study/compare/religious-tradition/by/state/
– Asian population: https://en.wikipedia.org/wiki/Demographics_of_Asian_Americans
tool: Mathematica

***************
– the ellipses are centered on the Rep/Dem means;
the standard deviations are represented by the ellipses' axes
– the 50 plot points represent the (diversity*, incarceration) coordinates for each state;
and are colored according to their 2020 Electoral College vote
– "r" is the Pearson correlation value

1

terrykrohe OP t1_iu5f3ot wrote

I found data from the St. Louis Fed and from the Sentencing Project source ... I will work something up

It would be a plot of black/white income ratio VS black/white incarceration ratio using the Rep/Dem state differentiation.

... thanks for the suggestion – it should be interesting

3

terrykrohe OP t1_iu35gp9 wrote

1
... the posts do have a uniform style; but the content of each post is unique

2
... the purpose of the post is explained in a previous comment. To repeat: there is a non-random top/bottom Rep/Dem differentiation of the data, in which the Rep states are always on the negative side. This generality is worth noting and, even more worthy, is an investigation using other metrics to (maybe) unearth an explanation.

Montaigne, "Of Pedantry", has a quoted comment: 'I hate above all pedantic learning'; to which he adds: We labor only to fill our memory, and leave the understanding and conscience empty.

3
... this Rep/Dem differentiation has been noted elsewhere:
https://www.thirdway.org/report/the-red-state-murder-problem
(see post 08Jul2021)
https://www.theguardian.com/us-news/2022/oct/27/life-expectancy-us-conservative-liberal-states
(see post 29Jul2021)

2