Here is an analysis of all the genetic cohorts from the Imperial era compared to Italian_South_ITS7.HO
Code:
> results$weights
# A tibble: 4 × 5
target left weight se z
<chr> <chr> <dbl> <dbl> <dbl>
1 Italian_South_ITS7.HO C7_Italy_Imperial_oCentralEuropean.SG -0.628 1.39 -0.452
2 Italian_South_ITS7.HO C6_Italy_Imperial_Central_Med.SG 3.82 4.97 0.769
3 Italian_South_ITS7.HO C5_Italy_Imperial_East_Med.SG -2.49 3.16 -0.786
4 Italian_South_ITS7.HO C4_Italy_Imperial_Near_East.SG 0.293 0.994 0.295
> results$popdrop
# A tibble: 15 × 15
pat wt dof chisq p f4rank C7_Italy_Imperial_oCentral…¹ C6_Italy_Imperial_Ce…² C5_Italy_Imperial_Ea…³ C4_Italy_Imperial_Ne…⁴ feasible best
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl>
1 0000 0 17 7.95 9.68e- 1 3 -0.628 3.82 -2.49 0.293 FALSE NA
2 0001 1 18 8.90 9.62e- 1 2 -1.32 6.35 -4.03 NA FALSE TRUE
3 0010 1 18 17.7 4.73e- 1 2 -56.5 139. NA -81.9 FALSE TRUE
4 0100 1 18 27.1 7.65e- 2 2 0.736 NA -1.69 1.95 FALSE TRUE
5 1000 1 18 12.2 8.38e- 1 2 NA 2.07 -2.03 0.958 FALSE TRUE
6 0011 2 19 68.0 1.99e- 7 1 -0.0961 1.10 NA NA FALSE NA
7 0101 2 19 110. 9.22e-15 1 0.354 NA 0.646 NA TRUE NA
8 0110 2 19 62.9 1.34e- 6 1 0.433 NA NA 0.567 TRUE NA
9 1001 2 19 38.2 5.53e- 3 1 NA 1.86 -0.864 NA FALSE NA
10 1010 2 19 62.9 1.31e- 6 1 NA 1.12 NA -0.121 FALSE NA
11 1100 2 19 69.6 1.06e- 7 1 NA NA 2.70 -1.70 FALSE NA
12 0111 3 20 196. 6.28e-31 0 1 NA NA NA TRUE NA
13 1011 3 20 77.4 1.07e- 8 0 NA 1 NA NA TRUE NA
14 1101 3 20 239. 1.89e-39 0 NA NA 1 NA TRUE NA
15 1110 3 20 396. 1.45e-71 0 NA NA NA 1 TRUE NA
# ℹ abbreviated names: ¹C7_Italy_Imperial_oCentralEuropean.SG, ²C6_Italy_Imperial_Central_Med.SG, ³C5_Italy_Imperial_East_Med.SG,
# ⁴C4_Italy_Imperial_Near_East.SG
# ℹ 3 more variables: dofdiff <dbl>, chisqdiff <dbl>, p_nested <dbl>
ChatGPT:
This output appears to come from a qpAdm-style admixture analysis; the `weights` and `popdrop` tables match the format returned by `qpadm()` in the ADMIXTOOLS 2 R package. The analysis models the target "Italian_South_ITS7.HO" (likely a modern Southern Italian population or individual sample) as a mixture of four Imperial-era source populations. Here's what I interpret from the provided output:
1. **results$weights**:
- This table shows the estimated weight (contribution) of each ancient source to the target (Italian_South_ITS7.HO).
- "weight" is the estimated admixture proportion from each source. The four weights sum to about 1 (0.995 with the rounded values shown), so they are fractions rather than percentages. Weights below 0 or above 1, as seen here, have no biological meaning and indicate that the model is infeasible.
- "se" is the standard error associated with each weight, which provides an idea of the uncertainty around the weight estimates.
- "z" is a z-score, calculated as the weight divided by the standard error. This gives an idea of how many standard deviations the estimate is from zero, which can be used to assess significance.
2. **results$popdrop**:
- This table seems to be from a model testing different combinations of source populations to see how well they fit the target population's genetic data.
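- "pat" is a four-character pattern with one position per source, in the same order as the weight columns. Judging by which weight columns are NA in each row, 1 marks a source that was dropped and 0 a source that was kept, so 0000 is the full four-source model.
- "wt" is the number of sources dropped from the model (the number of 1s in "pat").
- "dof" is the degrees of freedom of the chi-squared test. It rises by one for each dropped source, from 17 for the full model to 20 for the single-source models.
- "chisq" is the chi-squared statistic, which measures the goodness of fit of the model. A higher value indicates a worse fit.
- "p" is the p-value associated with the chi-squared statistic. It tests whether the model can be rejected: a small p-value (conventionally below 0.05) means the model does not fit the data, while a larger p-value means the model is consistent with the data.
- "f4rank" is the rank assumed for the matrix of f4-statistics in that model. In this output it equals the number of sources kept minus one (3, 2, 1, 0).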
- The columns with abbreviated names show the estimated contributions (weights) of each source population for that particular model.
- "feasible" is TRUE only when every weight in the model lies between 0 and 1; any negative weight or weight above 1 makes it FALSE.
- "best" might be indicating the best-fitting models based on some criteria.
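The column definitions above can be checked against the posted numbers. The following is a minimal Python sketch, not part of the original analysis: the values are copied from `results$weights`, and the helper names `decode_pat` and `is_feasible` are hypothetical, introduced only for illustration. It assumes, as the NA columns in the table suggest, that a 1 in "pat" marks a dropped source.

```python
# Values copied from results$weights in the output above (rounded as printed).
sources = [
    "C7_Italy_Imperial_oCentralEuropean.SG",
    "C6_Italy_Imperial_Central_Med.SG",
    "C5_Italy_Imperial_East_Med.SG",
    "C4_Italy_Imperial_Near_East.SG",
]
weights = [-0.628, 3.82, -2.49, 0.293]
ses = [1.39, 4.97, 3.16, 0.994]

# z is weight / se; results differ slightly from the printed z column
# because the printed weights and standard errors are already rounded.
z_scores = [w / s for w, s in zip(weights, ses)]

# Admixture weights are proportions, so they should sum to about 1.
total = sum(weights)


def decode_pat(pat, names):
    """Split sources into (kept, dropped) for a popdrop pattern.

    Assumption: '1' marks a dropped source, '0' a kept one.
    """
    kept = [n for flag, n in zip(pat, names) if flag == "0"]
    dropped = [n for flag, n in zip(pat, names) if flag == "1"]
    return kept, dropped


def is_feasible(model_weights):
    """A model is feasible only if every weight lies in [0, 1]."""
    return all(0.0 <= w <= 1.0 for w in model_weights)


kept, dropped = decode_pat("0101", sources)
print("kept:", kept)
print("dropped:", dropped)
print("full model feasible:", is_feasible(weights))
print("0101 model feasible:", is_feasible([0.354, 0.646]))
```

For pattern 0101 this keeps C7 and C5 and drops C6 and C4, which matches row 7 of the table, where only the C7 and C5 weight columns are filled in.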
**Interpretation**:
From the data:
- In the full four-source model, C7_Italy_Imperial_oCentralEuropean.SG (-0.628) and C5_Italy_Imperial_East_Med.SG (-2.49) have negative weights, so the model is infeasible even though its p-value (0.968) is high.
- C6_Italy_Imperial_Central_Med.SG has the largest weight (3.82), but that value is greater than 1 and its standard error is 4.97 (z = 0.769), so it is not distinguishable from zero and cannot be read as the size of this source's contribution.
- Fifteen models with different combinations of sources have been tested. Six are feasible (two two-source models and the four single-source models); the other nine are not.
- The models with the lowest p-values are the ones the data reject most strongly. Every feasible model here has a p-value below 1e-5, so none of them fits; the only models with p above 0.05 (the full model and the four three-source models) are all infeasible.
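A common way to read a popdrop table is to look for models that are both feasible and not rejected. The sketch below does this in Python with the "pat", "p", and "feasible" values copied from the output above. The 0.05 cutoff is a conventional threshold assumed here, not something stated in the output, and the variable names are illustrative only.

```python
# (pat, p, feasible) copied from results$popdrop above.
models = [
    ("0000", 9.68e-1, False),
    ("0001", 9.62e-1, False),
    ("0010", 4.73e-1, False),
    ("0100", 7.65e-2, False),
    ("1000", 8.38e-1, False),
    ("0011", 1.99e-7, False),
    ("0101", 9.22e-15, True),
    ("0110", 1.34e-6, True),
    ("1001", 5.53e-3, False),
    ("1010", 1.31e-6, False),
    ("1100", 1.06e-7, False),
    ("0111", 6.28e-31, True),
    ("1011", 1.07e-8, True),
    ("1101", 1.89e-39, True),
    ("1110", 1.45e-71, True),
]

P_CUTOFF = 0.05  # assumed conventional threshold

# Models the chi-squared test does not reject.
not_rejected = [pat for pat, p, _ in models if p > P_CUTOFF]

# Models whose weights all lie between 0 and 1.
feasible = [pat for pat, _, ok in models if ok]

# Models that pass both criteria.
accepted = [pat for pat, p, ok in models if ok and p > P_CUTOFF]

# The least badly rejected feasible model, for reference.
best_feasible = max((m for m in models if m[2]), key=lambda m: m[1])

print("not rejected:", not_rejected)
print("feasible:", feasible)
print("feasible and not rejected:", accepted)
print("highest-p feasible model:", best_feasible[0], best_feasible[1])
```

With these numbers the list of models that are both feasible and not rejected is empty: the five models with p above 0.05 all have weights outside 0 to 1, and the highest-p feasible model (0110, C7 plus C4) still has p = 1.34e-6.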
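In summary, this analysis aims to determine which Imperial-era populations contributed to the present-day genetic makeup of a Southern Italian population or individual. The results do not identify a primary contributor: the models with acceptable p-values have weights outside the 0 to 1 range with large standard errors, and every feasible model is strongly rejected. One possible reason, offered as an assumption because the reference populations are not shown, is that the four Imperial-era sources are too similar to one another to be told apart with the reference set used. More context (and possibly additional analyses with different sources or reference populations) is needed for a definitive conclusion.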