Making a pairwise distance matrix in pandas

This is a somewhat specialized problem that forms part of a lot of data science and clustering workflows. It starts with a relatively straightforward question: if we have a bunch of measurements for two different things, how do we come up with a single number that represents the difference between the two things?

An example […]
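As a rough illustration of the question above, here is a minimal sketch that turns a small table of measurements into a pairwise distance matrix. It assumes Euclidean distance, which is only one of many possible choices; the `measurements` frame, its column names, and its values are hypothetical and not taken from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical data: each row is a "thing", each column is a measurement.
measurements = pd.DataFrame(
    {"height": [1.0, 4.0, 1.0], "weight": [2.0, 6.0, 2.0]},
    index=["a", "b", "c"],
)

values = measurements.to_numpy()

# Broadcast to get every row-minus-row difference: shape (n, n, n_features).
diffs = values[:, None, :] - values[None, :, :]

# Assumed metric: Euclidean distance (square, sum over features, square root).
dist = np.sqrt((diffs ** 2).sum(axis=-1))

# Label rows and columns with the original index so pairs are easy to look up.
distance_matrix = pd.DataFrame(
    dist, index=measurements.index, columns=measurements.index
)
print(distance_matrix)
```

The result is symmetric with zeros on the diagonal, and each entry is the single number summarizing how far apart two rows are under the assumed metric.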

Pandas and Sklearn

Checking for missing data with the pandas isnull function: pandas isnull sum with column headers

for col in main_df:
    print(sum(pd.isnull(main_df[col])))

I get a list of the null count for each column:

0
1
100

What I’m trying to do is create a new dataframe which has the column header alongside the null count, e.g.

col1 | 0
col2 | 1
col3 […]
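One possible way to get the layout described above is sketched below. `DataFrame.isnull().sum()` returns a Series indexed by column name, and `reset_index` turns that Series into a two-column dataframe. The sample `main_df` and the output column names `column` and `null_count` are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for main_df, with a different number of nulls per column.
main_df = pd.DataFrame(
    {
        "col1": [1.0, 2.0, 3.0],
        "col2": [1.0, np.nan, 3.0],
        "col3": [np.nan, np.nan, 3.0],
    }
)

# Count nulls per column; the result is a Series indexed by column name.
null_counts = main_df.isnull().sum()

# Move the column names out of the index into a regular column.
null_df = null_counts.rename_axis("column").reset_index(name="null_count")
print(null_df)
```

This avoids the explicit loop, and the header stays attached to its count because the column names travel with the Series index.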