Name your child for success!

Shakespeare famously posed the question:
"What's in a name?"
The answer may actually be: quite a bit!

Your given name (and your family name, for that matter) likely contains a lot of subtle information about you and your history. For example, we often assume names correlate with gender (as I have in previous articles)... Except the gender identity of some common names has changed over history (examples here)! Your name may also correlate with your political affiliation or what job you have.

I recently wondered: do certain names correlate with brilliance or high intellectual achievement? 

To find out, I gathered a large dataset of full names from people with PhDs in science (from the IAU and AAAS), as well as the names of lawyers using several recent years of bar exam "pass lists" provided by WA, NY, and TX. In total I was able to easily (read: quickly) gather over 36,000 full names of scientists and lawyers!

With this corpus of highly educated names in hand, let's look at which are most common!
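For anyone who wants to try this at home, here is a minimal sketch of how the counting could be done. It assumes the names are already in a plain list of "First Last" strings; the helper names and the tiny sample list are made up for illustration, and this is not necessarily how the original tally was produced.

```python
from collections import Counter


def first_name(full_name):
    """Return the first whitespace-separated token, title-cased."""
    parts = full_name.strip().split()
    return parts[0].title() if parts else ""


def most_common_first_names(full_names, n=10):
    """Count first names and return the n most frequent as (name, count) pairs."""
    counts = Counter(first_name(name) for name in full_names if name.strip())
    return counts.most_common(n)


# Hypothetical sample, NOT taken from the real dataset
sample = ["Michael Smith", "michael jones", "Sarah Lee", "David Kim", "Sarah  Park"]
top = most_common_first_names(sample, n=2)
print(top)
```

Title-casing the first token is a crude way to merge "michael" and "Michael"; real name lists (initials, "Last, First" ordering, hyphenated names) would need more careful cleaning.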

The most common names of scientists:

Right away you can see a dramatic trend: it's mostly dudes. In fact, of the top 100 most common names for scientists, only 14 are female!

The most common names of lawyers:

While men's names still dominate, there are definitely more women in the top list. For comparison, of the top 100 most common names for lawyers, 50 are female! That difference is shocking to me.

Lawyers have more diverse names

Here I compare the distribution of name frequencies between the datasets. For the top 500 names of lawyers and scientists, I've counted the occurrence rate of each name. You can clearly see that the scientists (red line) are much more concentrated in the first 10-ish names. (Actually, this is visible in the wordles above also.) This might seem surprising, given that the lawyer names only come from 3 states in the US, and the scientists come from all over the world.

I next wondered, how does the rank of scientist and lawyer names compare? Obviously names like "Michael" are high in both lists, but how well are they correlated? The answer: Not Very Well!
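One way to put a number on "how well are they correlated" is a rank correlation. The sketch below uses Spearman's rho on the names that appear in both lists. That choice is an assumption on my part, since the comparison here was made by eye from a plot, and the two small rank dictionaries are invented for illustration.

```python
def spearman_rank_correlation(ranks_a, ranks_b):
    """Spearman's rho for the names present in both rank dictionaries.

    Each argument maps name -> rank (1 = most common). Names are
    re-ranked within the shared subset so both rankings run 1..n
    (names with equal input ranks are ordered arbitrarily).
    """
    shared = [name for name in ranks_a if name in ranks_b]
    n = len(shared)
    if n < 2:
        raise ValueError("need at least two shared names")
    order_a = sorted(shared, key=lambda name: ranks_a[name])
    order_b = sorted(shared, key=lambda name: ranks_b[name])
    pos_a = {name: i + 1 for i, name in enumerate(order_a)}
    pos_b = {name: i + 1 for i, name in enumerate(order_b)}
    d_squared = sum((pos_a[name] - pos_b[name]) ** 2 for name in shared)
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Invented ranks for illustration only
scientist_ranks = {"Michael": 1, "David": 2, "John": 3, "Robert": 4}
lawyer_ranks = {"Michael": 1, "John": 2, "David": 3, "Sarah": 4}
rho = spearman_rank_correlation(scientist_ranks, lawyer_ranks)
print(rho)
```

A value near 1 means the two lists order their shared names the same way, near 0 means little relationship, and near -1 means the orders are reversed.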

Huh! Something else is at work here. The scientist and lawyer names obviously do not come from the same distribution. I then realized: the scientists sampled are more "senior", while the lawyers are all very "young". The two groups may be separated by an age difference of over 25 years. This lack of name rank correlation, and maybe the different top names, might be related to the era these names are from. The larger number of women's names among lawyers is surely due to a great increase in the fraction of female attorneys compared to yesteryear.

To investigate this further, I gathered two "ground truth" datasets from the Social Security Administration (SSA): The 100 top baby names over the past 100 years, and the 100 top baby names of 1989. These datasets were broken down by gender, so the rest of my analysis is also.

Here I'm showing the rank of the top 100 names for lawyers, compared with the name rank according to the SSA. For lawyers, male name ranks correlate better with 1989 than the 100-year historic set. Female name ranks are slightly better associated with the historic 100-year set.

For scientists (again, note the dearth of women's names), male name ranks are much more correlated with the 100-year historic set, while female name ranks correlate better with the 1989 data.

The top names are...

Female Scientists:

Male Scientists:

Female Lawyers:

Male Lawyers:

The Dream of Spaceflight

Me and a fellow member of team "Gagarin", at Space Camp (Huntsville, AL) circa 2001
This past Friday, during a test flight over the Mojave Desert, the privately funded SpaceShipTwo crashed, killing one test pilot and severely injuring another. Combined with the destruction of a privately built Antares rocket during launch (thankfully nobody injured), it's been a hard week for the business of space.

Some worry these two accidents have been a damning setback to the dream. Here the dream I'm referring to is making space travel common, available to everyone. And to be clear, it's a LONG way off... Currently it costs between $1 and $10 per lb to fly on commercial airlines (of course, we don't buy tickets by the lb, but it's a round number). By comparison, launching something into space costs around $10,000 per lb, at least 1000X more. When private companies are able to make spaceflight an everyday reality, we'll be a lot closer to this dream.

Around 540 people have flown in space (and perhaps a few more if you believe militaries have flown secret missions). That's it! Fewer than 600 humans have ever left our world. Given that the World population is still increasing, I wondered: Has the number of people who have flown into space actually kept up with population growth?

In other words, even though it's insanely expensive to launch people, and very few have been fortunate enough to go, are we making any progress on the dream?

Here's a visual timeline of all (publicly known) humans ever launched into space, from Wikipedia:

The number of spaceflight missions per year (both crewed and not) reached its peak in the early 1960's. It's an encouraging sign the recent trend has reversed, and that the number of missions per year has continued to grow for the past decade.
From Wikipedia

An Increase of People

For reference, here's the growth of the World population since the beginning of the human spaceflight era according to the US Census Bureau. For spaceflight to become commonplace, we have to be launching new people faster than this curve.
World Population since 1957

I was able to grab a nicely formatted table containing the name of every known astronaut (and all other varieties of 'naut), as well as their first launch date. Here is the cumulative number of people who have flown into space over time:
Number of people in space over time
This is such a neat graph! It encodes a vast amount of the history of human spaceflight. Note this does not count the total number of times people have flown, since many astronauts have flown on multiple missions and are only counted once here. See the slight flattening around 1967? That's because the US didn't fly any crewed missions in 1967 after the tragic Apollo 1 fire. My parents were kids then.
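Building a curve like this takes very little code. Here is a rough sketch, assuming the table has been boiled down to a mapping from each person to the year of their first launch (the function name and that input format are my own choices for illustration). The five entries below are just a tiny excerpt of well-known first flights, not the full table.

```python
from collections import Counter
from itertools import accumulate


def cumulative_first_flights(first_launch_year):
    """Map each year to the total number of distinct people flown so far.

    `first_launch_year` maps a person's name to the year of their first
    launch, so repeat flyers are counted only once.
    """
    if not first_launch_year:
        return {}
    per_year = Counter(first_launch_year.values())
    years = range(min(per_year), max(per_year) + 1)
    # Years with no first-time flyers add zero, which flattens the curve
    totals = accumulate(per_year.get(year, 0) for year in years)
    return dict(zip(years, totals))


# A tiny excerpt of first flights, for illustration
first_flights = {
    "Yuri Gagarin": 1961,
    "Alan Shepard": 1961,
    "John Glenn": 1962,
    "Valentina Tereshkova": 1963,
    "Alexei Leonov": 1965,
}
curve = cumulative_first_flights(first_flights)
print(curve)
```

Keying on the first launch is what keeps someone with several missions from being counted more than once.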

You can see a steady linear growth in the number of space-people throughout the 1960's and 1970's, even after the Apollo missions to the Moon ended in 1972. The cumulative curve takes a sharp bend upwards in the early 1980's when the Space Shuttle program begins. I show up around this time as well.

This growth is ambitious, and by 1985 the US is launching more humans into space per year than ever before (or since). In 1986 there is another sobering bend in this curve. I was 3 years old when Challenger exploded. The Shuttle program endured, and the cumulative curve began to grow again. There's a much smaller bend in the curve in 2003. I was in college when Columbia disintegrated during reentry.

I don't mention these accidents to make light of them, or to diminish the horrible tragedy of last Friday. Going into space is dangerous business. Even testing and training for it is dangerous. For spaceflight to become commonplace, it must be both affordable and safe. 

Are We Approaching The Dream?

So graphically speaking, the goal is to ensure this curve with the cumulative number of space travelers grows faster than the population. (I realize this is actually a huge over-simplification, and somewhat incorrect.) Nevertheless, here is what dividing these curves looks like, that is, [cumulative number of people who have traveled to space] / [world population] each year.

Thankfully this curve has a positive slope, though given the small numbers of space travelers it's not surprising. (Apologies for the odd units on the vertical axis.) This graph says that overall we've been winning: a higher fraction of our species has traveled to space over time. Maybe one day I'll be lucky enough to count myself among them.

The curve has slowed in recent years. If you were to simply fit a straight line to this fraction over the past 50+ years (and I did), you'd expect that we'll reach 1 in every 1000 humans having traveled in space around the year..... 600,000 AD.
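The extrapolation itself is just a least-squares line solved for the year it crosses the target fraction. Below is a minimal sketch of that arithmetic in plain Python. The function names are mine, and the years and fractions are placeholder values I made up so the code runs; they are NOT the real data and are not meant to reproduce the number quoted above.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def year_reaching(target, slope, intercept):
    """Year at which the fitted line reaches `target` (needs slope > 0)."""
    if slope <= 0:
        raise ValueError("fraction is not increasing; target is never reached")
    return (target - intercept) / slope


# Placeholder values for illustration only, NOT the real curve
years = [1970, 1980, 1990, 2000, 2010]
fractions = [1e-8, 3e-8, 5e-8, 7e-8, 9e-8]

slope, intercept = fit_line(years, fractions)
print(year_reaching(1e-3, slope, intercept))
```

Because the target fraction (1 in 1000) is so many orders of magnitude above the current one, even small changes in the fitted slope move the answer by thousands of years.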

This linear fit is obviously (or hopefully) an absurd model for how we expect the fraction of space travelers to grow. The curve should begin to grow again once programs like Orion (or Dream Chaser) come online. Hopefully we see a steeper increase with these new missions, like we saw in the early 1980's with the Shuttle. Private missions like SpaceShipTwo will help make this exponential growth a reality. And to paraphrase a well-known saying, exponential growth is the most powerful force in the Universe.

How Long Between Apple Releases?

As an Apple computer fan I've been a long time reader of Mac Rumors, a website that reports great Mac news and rumors. One great feature is their Buyer's Guide, which tracks the refresh history data (and rumors) to suggest when products are due for an upgrade.

I've been wondering a) if Apple products are actually getting cheaper over time, and b) if Apple is refreshing products faster.

Here's the base cost versus the days since last refresh for a bunch of current Apple products on Mac Rumors:

Random Forest for Time Series Forecasting

I recently spent a week at the 2014 Astro Hack Week, a week-long summer school + hack event full of astronomers (and some brave others). The week was full of high level chats about statistics, data analysis, coffee, and astrophysics. There was a great crowd of people, many of whom you can (and should) follow on Twitter. Below is a quick post I wrote up detailing one of my afternoon "hack projects", which was originally posted on the HackWeek's blog here.

After Josh Bloom's wonderful lecture on Random Forest regression I was excited to try out his example code on my Kepler data. Josh explained regression with machine learning as taking many data points with a variety of features/attributes, and using relationships between these features to predict some other parameter. He explained that the Random Forest algorithm works by constructing many decision trees, which are used to construct the final prediction.

I wondered: could I use the Random Forest (RF) to do time series forecasting? Of course, as Jake noted, RF only predicts single properties. As a result, RF isn't a good choice for doing trend forecasting over long time periods. (Well, maybe.) Instead, this would use RF to just predict the next datapoint.
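To frame "predict the next datapoint" as a regression problem, the time series first has to be reshaped into feature rows and targets. Here is a minimal sketch of one common way to do that, using the previous few values as the features. The window length, the function name, and the flux values are all assumptions for illustration, not the setup from the hack project.

```python
def make_lagged_dataset(series, n_lags):
    """Turn a 1-D series into (features, targets) for next-point prediction.

    Each feature row holds `n_lags` consecutive values; the target is the
    value that immediately follows them.
    """
    if n_lags < 1 or len(series) <= n_lags:
        raise ValueError("need n_lags >= 1 and a series longer than n_lags")
    n_rows = len(series) - n_lags
    features = [list(series[i:i + n_lags]) for i in range(n_rows)]
    targets = [series[i + n_lags] for i in range(n_rows)]
    return features, targets


# Made-up flux values, not real Kepler data
flux = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0]
X, y = make_lagged_dataset(flux, n_lags=3)
for row, target in zip(X, y):
    print(row, "->", target)
```

The resulting `X` and `y` could then be handed to any regressor with a fit/predict interface, for example scikit-learn's `RandomForestRegressor`, and the most recent `n_lags` values used as the feature row for the forecast.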
