Slide 1 / 8
239 engineers · 3 projects · 21 statistical tests
What the data found
Three things to help engineers grow.
All readable from the Jira and git data your team already has.
Slide 2 / 8
Apache Camel · Spark · Hadoop · 2007–2022
Three predictors of growth. None of them are training programs.
1. Give whole problems, not fragments.
Growing engineers own problems end-to-end. They do about half the sub-task work of their peers.
Spark: 7.7% vs 13.9% sub-task rate · p=0.02
2. Escalate the difficulty over time.
Growing engineers' resolution time rises because their work gets harder, not because they slow down.
Days delta: +54d vs +8d · p<0.001
3. Create a range, not just difficulty.
Hadoop tickets are 6× harder than Spark's and 18× harder than Camel's by median complexity (0.18 vs 0.03 and 0.01), yet Hadoop has zero growth signal. Where everyone does hard work, no rising arc is detectable.
21 tests · 0 significant predictors in Hadoop
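A minimal sketch of how the first two predictors could be computed from exported ticket records. The field names (`assignee`, `type`, `created`, `resolved`), the sample tickets, and the definition of days delta as "mean resolution days in the later half of an engineer's history minus the earlier half" are all assumptions for illustration; the study's exact definitions are not given here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical ticket export; field names are assumptions, not a Jira schema.
TICKETS = [
    {"assignee": "ana", "type": "Bug", "created": date(2020, 1, 1), "resolved": date(2020, 1, 5)},
    {"assignee": "ana", "type": "Improvement", "created": date(2020, 2, 1), "resolved": date(2020, 2, 7)},
    {"assignee": "ana", "type": "Bug", "created": date(2020, 6, 1), "resolved": date(2020, 7, 1)},
    {"assignee": "ana", "type": "New Feature", "created": date(2020, 9, 1), "resolved": date(2020, 10, 11)},
    {"assignee": "ben", "type": "Sub-task", "created": date(2020, 1, 10), "resolved": date(2020, 1, 15)},
    {"assignee": "ben", "type": "Bug", "created": date(2020, 3, 1), "resolved": date(2020, 3, 6)},
    {"assignee": "ben", "type": "Sub-task", "created": date(2020, 6, 10), "resolved": date(2020, 6, 16)},
    {"assignee": "ben", "type": "Bug", "created": date(2020, 9, 10), "resolved": date(2020, 9, 16)},
]


def subtask_ratio(tickets):
    """Share of each engineer's tickets that are sub-tasks."""
    total = defaultdict(int)
    subs = defaultdict(int)
    for t in tickets:
        total[t["assignee"]] += 1
        if t["type"] == "Sub-task":
            subs[t["assignee"]] += 1
    return {who: subs[who] / n for who, n in total.items()}


def days_delta(tickets):
    """Later-half minus earlier-half mean resolution days, per engineer (assumed definition)."""
    by_engineer = defaultdict(list)
    for t in sorted(tickets, key=lambda t: t["created"]):
        by_engineer[t["assignee"]].append((t["resolved"] - t["created"]).days)
    deltas = {}
    for who, days in by_engineer.items():
        if len(days) < 2:
            continue  # one ticket gives no before/after comparison
        half = len(days) // 2
        deltas[who] = mean(days[half:]) - mean(days[:half])
    return deltas


if __name__ == "__main__":
    print(subtask_ratio(TICKETS))
    print(days_delta(TICKETS))
```

In this toy data, "ana" has no sub-tasks and a large positive days delta, while "ben" splits time evenly between sub-tasks and whole tickets with an almost flat delta. Comparing the two groups for significance would need a separate statistical test, which this sketch does not attempt.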
Slide 3 / 8
Apache Spark · contributor communication network · 372 engineers
Node size = comment volume. Green = Growing · Amber = Coasting · Grey = No change. The largest, most connected nodes are amber. Growing engineers are the small quiet ones — deep in the work, low in network visibility.
Median complexity: Hadoop 0.18 · Spark 0.03 · Camel 0.01. Hadoop tickets cluster in a uniformly high band — no easy tier, no gradient to grow across. Hard environment, but zero growth signal.
Solid = Growing, hatched = Other. Stars = statistically significant difference. Spark shows clear separation on Days delta (***) and Sub-task ratio (*). Hadoop: no separation on any factor — the complexity floor masks all signal.
Hadoop tickets are 6–18× harder by every measure. Yet not a single variable predicts growth.
When the distribution is uniformly high, the within-engineer variance collapses — there is no rising arc for the data to detect. Spark and Camel, with wide complexity ranges, both show clear signal.
The data can't prove causation here — but the pattern is consistent: measurable growth needs a measurable gradient.
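To illustrate the gradient argument, here is a small sketch under stated assumptions: each engineer's history is a list of per-ticket complexity scores in chronological order, and the "rising arc" is taken to be the least-squares slope of those scores against ticket order. Both histories and the function name `arc_slope` are hypothetical; the scores only echo the medians quoted above.

```python
from statistics import mean, pvariance


def arc_slope(scores):
    """Least-squares slope of complexity against ticket order (0, 1, 2, ...)."""
    xs = list(range(len(scores)))
    x_bar, y_bar = mean(xs), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    return sxy / sxx


# Hypothetical histories: one spans an easy-to-hard range, one is uniformly hard.
wide_range = [0.01, 0.02, 0.05, 0.10, 0.20]
uniformly_hard = [0.18, 0.18, 0.18, 0.18, 0.18]

if __name__ == "__main__":
    for name, scores in [("wide range", wide_range), ("uniformly hard", uniformly_hard)]:
        print(name, "variance:", pvariance(scores), "slope:", arc_slope(scores))
```

With no spread in difficulty, both the within-engineer variance and the slope are zero, so there is nothing for a test to separate, however hard the tickets are.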
Slide 8 / 8
task2vec
What does this look like in your team's data?
Sub-task ratio. Scope delta. Ticket difficulty arc.
Computable from the Jira and git history you already have.
No surveys. No new tooling. No performance theatre.