Adding a column to a DataFrame
We had a (simplified) DataFrame containing some information about customers on board the Titanic.
We wanted to add a ‘Survived’ column to it by doing a lookup in a second DataFrame, survival_table, to work out the appropriate value.
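As a rough sketch of what the two tables might look like, the snippet below builds both with pandas. Only the names `survival_table` and `Survived` come from the text above; the variable name `customers`, the other column names and every value are invented for illustration and are not the original data.

```python
import pandas as pd

# Hypothetical, simplified passenger data (values invented for illustration).
customers = pd.DataFrame({
    "PassengerId": [892, 893],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
})

# Hypothetical lookup table: one Survived value (0 or 1) per Pclass/Sex pair.
survival_table = pd.DataFrame({
    "Pclass": [1, 1, 3, 3],
    "Sex": ["female", "male", "female", "male"],
    "Survived": [1, 0, 1, 0],
})

print(customers)
print(survival_table)
```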
To do this we can use the DataFrame.apply function, which with axis=1 allows us to map over each row.
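To illustrate mapping over rows, here is a small sketch of `DataFrame.apply` with `axis=1`. The `customers` data and the label format are assumptions made up for this example.

```python
import pandas as pd

# Hypothetical passenger data (invented for illustration).
customers = pd.DataFrame({
    "PassengerId": [892, 893],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
})

# With axis=1 the function receives each row as a Series keyed by column name.
# Returning a scalar per row gives back a Series with one entry per row.
labels = customers.apply(lambda row: f"{row['Sex']}/{row['Pclass']}", axis=1)

print(labels.tolist())
```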
Our initial attempt passed a calculate_survival function to apply and assigned the result straight to the new column.
When we ran that we got an exception.
After much googling and confusion as to why we were getting this error, I tried printing out the result of calling apply rather than immediately assigning it, and realised that the output wasn’t what I expected.
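The behaviour can be reproduced with a small sketch along these lines. The body and signature of `calculate_survival` are a guess at what such a lookup might look like, and the data is invented; the point is only that a function returning a Series per row makes `apply` hand back a DataFrame.

```python
import pandas as pd

# Hypothetical data (invented for illustration).
customers = pd.DataFrame({
    "PassengerId": [892, 893],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
})
survival_table = pd.DataFrame({
    "Pclass": [1, 1, 3, 3],
    "Sex": ["female", "male", "female", "male"],
    "Survived": [1, 0, 1, 0],
})

def calculate_survival(survival_table, customer):
    # Boolean-mask lookup: keep the rows matching this customer's Pclass and Sex.
    survival_row = survival_table[
        (survival_table["Pclass"] == customer["Pclass"])
        & (survival_table["Sex"] == customer["Sex"])
    ]
    # This is a one-element Series, not a scalar.
    return survival_row["Survived"]

# Inspect the result instead of assigning it to a column.
result = customers.apply(lambda c: calculate_survival(survival_table, c), axis=1)

print(type(result).__name__)
print(result.shape)
print(result)
```

Because the two customers match different rows of `survival_table`, the returned Series carry different index labels, and pandas lines them up as separate columns padded with NaN.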
I’d expected to get one column showing the survived values but instead we got a 2×2 DataFrame. Adding some logging to the calculate_survival function revealed why.
Our function was actually returning a Series object rather than a single value of 0 or 1, which I found surprising. We can use the iat accessor to retrieve a scalar value from a Series by position.
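A minimal sketch of the difference, using an invented lookup table: filtering with a boolean mask and selecting a column yields a Series even when exactly one row matches, and `.iat[0]` pulls out the first element by position.

```python
import pandas as pd

# Hypothetical lookup table (invented for illustration).
survival_table = pd.DataFrame({
    "Pclass": [1, 1, 3, 3],
    "Sex": ["female", "male", "female", "male"],
    "Survived": [1, 0, 1, 0],
})

mask = (survival_table["Pclass"] == 3) & (survival_table["Sex"] == "male")
survived = survival_table[mask]["Survived"]

# Still a Series, even though only one row matched.
print(type(survived).__name__, len(survived))

# iat[0] returns the first element by position as a scalar.
value = survived.iat[0]
print(value)
```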
Now if we assign the output of that function like before, it works as expected.
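Putting it together, a sketch of the working version might look like this. As before, the data, the column names other than `Survived`, and the exact shape of `calculate_survival` are assumptions for illustration.

```python
import pandas as pd

# Hypothetical data (invented for illustration).
customers = pd.DataFrame({
    "PassengerId": [892, 893],
    "Pclass": [3, 1],
    "Sex": ["male", "female"],
})
survival_table = pd.DataFrame({
    "Pclass": [1, 1, 3, 3],
    "Sex": ["female", "male", "female", "male"],
    "Survived": [1, 0, 1, 0],
})

def calculate_survival(survival_table, customer):
    survival_row = survival_table[
        (survival_table["Pclass"] == customer["Pclass"])
        & (survival_table["Sex"] == customer["Sex"])
    ]
    # iat[0] turns the one-element Series into a scalar.
    return survival_row["Survived"].iat[0]

# Each row now yields a scalar, so apply returns a Series we can assign.
customers["Survived"] = customers.apply(
    lambda c: calculate_survival(survival_table, c), axis=1
)

print(customers)
```

Note that `iat[0]` raises an IndexError when no row of the lookup table matches, so this sketch assumes every customer has an entry in `survival_table`.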