Why Animate#
Here are some reasons you may want to use ahlive to bring your data to life!
add dimension#
Data can be mapped not only to color, but also to time!
[1]:
import ahlive as ah
import xarray as xr
ds = xr.tutorial.open_dataset("air_temperature")
ds["month"] = ds["time"].dt.strftime("%m - %b")
ds = ds.groupby("month").mean()
ds_kcmi = ds["air"].sel(lon=-88.5 + 360, lat=40, method="nearest")
tmpf_inline = ds_kcmi * 1.8 - 459.67
ah.Dataset(
    ds,
    xs="lon",
    ys="lat",
    cs="air",
    state_labels="month",
    title="2013-2014 Monthly Averaged Temperatures",
    projection="LambertConformal",
    coastline=True,
    revert="boomerang",
    fps=10
).reference(
    x0s=[-88.5] * 12,  # help wanted to simplify this method!
    y0s=[40] * 12,
    inline_labels=tmpf_inline
).config(**{
    "ref_inline": dict(suffix="°F", color="black"),
    "ref_plot": dict(color="black"),
    "state": dict(color="black")
}).render()
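The inline reference labels above convert the air temperature from Kelvin to degrees Fahrenheit via `* 1.8 - 459.67`. As a sanity check, here is a minimal standalone sketch of that conversion (the helper name is illustrative, not part of ahlive):

```python
def kelvin_to_fahrenheit(kelvin):
    """Convert a temperature from Kelvin to degrees Fahrenheit."""
    return kelvin * 1.8 - 459.67

# Freezing point of water: 273.15 K should come out near 32 °F
print(round(kelvin_to_fahrenheit(273.15), 2))
```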
[1]:
tell story#
Present the plot, piece by piece, and highlight the important parts to tell a short story.
[2]:
import ahlive as ah
df = ah.open_dataset(
    label="iem_asos",
    stn=["KDEN"],
    ini="2019-10-09",
    end="2019-10-10",
    tz="mst",
    data="tmpf",
    verbose=True
).dropna(subset=["tmpf"])
ah.DataFrame(
    df,
    xs="valid",
    ys="tmpf",
    xlims="fixed",
    ylim0s="explore",
    inline_labels="tmpf",
    ylabel="°F",
    ymargins=(0.3, 0.15),
    title="2019-10-09 Temperature at Denver Airport",
).remark(
    "Sunrise at 7 AM",
    xs="2019-10-09 06:53"
).remark(
    "High reached at 3 PM",
    ys=82,
    durations=3
).remark(
    "Low reached at 11 PM, change of 52°F within 8 hours!",
    ys=30,
    first=True,
    durations=3,
).config(
    "inline",
    suffix="°F"
).render()
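The remark trigger values above (the 6:53 sunrise, the 82°F high, the 30°F low) were read off the data ahead of time. A minimal sketch of how one might locate them in a temperature series, using pandas on synthetic data (the values below are made up and only stand in for the real KDEN observations):

```python
import pandas as pd

# Synthetic hourly temperatures standing in for the KDEN "tmpf" column
df = pd.DataFrame({
    "valid": pd.date_range("2019-10-09", periods=24, freq="h"),
    "tmpf": [40, 38, 36, 35, 34, 33, 32, 35, 45, 55, 62, 70,
             76, 80, 82, 81, 78, 70, 60, 50, 42, 36, 32, 30],
})

high = df.loc[df["tmpf"].idxmax()]  # row of the daily high
low = df.loc[df["tmpf"].idxmin()]   # row of the daily low
swing = high["tmpf"] - low["tmpf"]  # temperature change between them
print(high["valid"], high["tmpf"])
print(low["valid"], low["tmpf"])
print(f"swing: {swing}°F")
```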
IEM ASOS
Source: Iowa Environment Mesonet ASOS
https://mesonet.agron.iastate.edu/ASOS/
Description: The IEM maintains an ever growing archive of automated airport weather observations from around the world! These observations are typically called 'ASOS' or sometimes 'AWOS' sensors. A more generic term may be METAR data, which is a term that describes the format the data is transmitted as. If you don't get data for a request, please feel free to contact us for help. The IEM also has a one minute interval dataset for US ASOS (2000-) and Iowa AWOS (1995-2011) sites. This archive simply provides the as-is collection of historical observations, very little quality control is done.
Data: https://mesonet.agron.iastate.edu/cgi-bin/request/asos.py?station=KDEN&data=tmpf&latlon=no&elev=no&year1=2019&month1=10&day1=09&year2=2019&month2=10&day2=10&tz=America/Denver&format=onlycomma&missing=empty&trace=empty&direct=no&report_type=1&report_type=2
Adding extra dimensionality to a visualization does not automatically make it better! Remember, the primary goal of a data visualization is to communicate an aspect of the data effectively.
show progression#
What better way to show change over time… than over the time dimension!
[3]:
import ahlive as ah
df = ah.open_dataset(
    "owid_nuclear_weapons_tests_arms_control_association_2020",
    names=["country", "year", "count"], verbose=True
)
df = df.loc[df["country"].isin(["Russia", "United States"])]
ah.DataFrame(
    df,
    xs="country",
    ys="count",
    label="country",
    title="Nuclear Arms Test",
    inline_labels="count",
    state_labels="year",
    chart="barh",
    preset="race",
    xlim1s="explore",
    figsize=(12, 5),
    frames=5
).config(**{
    "spacing": dict(left=0.2),
    "durations": dict(transition_frames=0.08)
}).render()
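The `preset="race"` bar chart re-ranks the bars by value at every state, so countries swap positions as the years tick by. A minimal sketch of that per-year ranking with pandas on synthetic counts (the numbers are illustrative, not the actual OWID figures):

```python
import pandas as pd

# Synthetic yearly test counts standing in for the OWID data
df = pd.DataFrame({
    "country": ["Russia", "United States"] * 2,
    "year": [1961, 1961, 1962, 1962],
    "count": [59, 10, 79, 98],
})

# Rank within each year, highest count first (the ordering a bar race animates)
df["rank"] = df.groupby("year")["count"].rank(ascending=False).astype(int)
print(df.sort_values(["year", "rank"]))
```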
NUCLEAR WEAPONS TESTS – ARMS CONTROL ASSOCIATION (2020)
Source: Arms Control Association curated by Our World in Data (OWID)
https://www.armscontrol.org/factsheets/nucleartesttally through https://github.com/owid/owid-datasets
Description: This datasets provides the number of nuclear weapons tests by country using data from the Arms Control Association (2020).
You can download the code and complete dataset, including supplementary variables, from GitHub: https://github.com/owid/notebooks/tree/main/BastianHerre/nuclear_weapons
Data: https://raw.githubusercontent.com/owid/owid-datasets/master//datasets/Nuclear%20weapons%20tests%20%E2%80%93%20Arms%20Control%20Association%20%282020%29/Nuclear%20weapons%20tests%20%E2%80%93%20Arms%20Control%20Association%20%282020%29.csv
[3]:
emphasize extremes#
With dynamic limits, extremes can be highlighted easily.
[4]:
import ahlive as ah
df = ah.open_dataset(
    "owid_conflict_deaths_by_country_ucdp_2019",
    names=["country", "year", "deaths"], verbose=True
)
df = df.loc[df["country"].isin(df.nlargest(7, "deaths")["country"].unique())]
ah.DataFrame(
    df,
    xs="year",
    ys="deaths",
    label="country",
    state_labels="year",
    inline_labels="deaths",
    title="Conflict Deaths by Country",
    chart="scatter",
    preset="trail",
    join="cascade",
    ylim1s="explore",
    xlim1s="fixed",
    figsize=(10, 5),
    frames=5,
    fps=30,
).config(
    "preset", chart="both"
).render()
CONFLICT DEATHS BY COUNTRY – UCDP (2019)
Source: Sundberg, Ralph, and Erik Melander, 2013, “Introducing the UCDP Georeferenced Event Dataset”, Journal of Peace Research, vol.50, no.4, 523-532 curated by Our World in Data (OWID)
https://ucdp.uu.se/downloads/ through https://github.com/owid/owid-datasets
Description: Aggregation of the 'best' estimates for deaths listed for all incidents in UCDP GED 19.1 for each country and year. UCDP defines incidents as being "where armed force was used by an organised actor against another organized actor, or against civilians, resulting in at least 1 direct death at a specific location and a specific date".
Note that currently the data excludes Syria.
Data: https://raw.githubusercontent.com/owid/owid-datasets/master//datasets/Conflict%20deaths%20by%20country%20%E2%80%93%20UCDP%20%282019%29/Conflict%20deaths%20by%20country%20%E2%80%93%20UCDP%20%282019%29.csv
[4]:
engage audience#
Sometimes, it’s just more intriguing!
[5]:
import ahlive as ah
import matplotlib.pyplot as plt
from cycler import cycler
df = ah.open_dataset(
    "owid_causes_of_death_vs_media_coverage_shen_et_al_2018", verbose=True,
    names=["cause", "year", "actual", "google search", "nyt", "guardian"]
).dropna().melt(["cause", "year"], var_name="source", value_name="share")
df["source"] = df["source"].str.upper()
cmap = plt.get_cmap("plasma", df["cause"].nunique())
plt.rc("axes", prop_cycle=cycler("color", cmap.colors))
sources = df["source"].unique()
ah.layout([
    ah.DataFrame(
        df.loc[df["source"] == source],
        xs="share",
        label="cause",
        title=source,
        xlabel=" ",
        suptitle="Shen et al. (2018) Death: Reality vs Reported",
        chart="pie",
        legend=False,
        state_labels="year",
        inline_labels="cause",
        figsize=(15, 12),
        fps=5,
        autopct="%.0f%%",
        normalize=True,
    ) for source in sources
]).config(**{
    "state": dict(xy="suptitle"),
    "inline": dict(
        offset=1.25,
        color="black",
        fontsize=12,
        clip_on=False
    )
}).cols(2).render()
CAUSES OF DEATH VS. MEDIA COVERAGE (SHEN ET AL. 2018)
Source: Shen et al. (2018). Death: Reality vs Reported. curated by Our World in Data (OWID)
https://owenshen24.github.io/charting-death/ through https://github.com/owid/owid-datasets
Description: Shen et al. (2018) compared the leading causes of death in the United States as their share of total deaths relative to Google searches and media coverage in The New York Times (NYT) and The Guardian newspaper. For this analysis they selected the top 10 causes of death in the USA in addition to terrorism, homicide, and drug overdoses (which they assumed to also receive significant media attention).
Data each causes' share of total deaths in the USA was assessed based on the Centers for Disease Control and Prevention (CDC) WONDER database for public health, available at: https://wonder.cdc.gov/. This is available from 1999 to 2016. Combined, the 13 causes of death assessed in this analysis account for approximately 88% of all deaths in the USA.
Data on Google searches was derived from Google Trends (available from 2004 to 2016). This was assessed on the number of searches for these terms and close synonyms.
The New York Times and The Guardian media coverage was assessed from both newspapers' article databases. Here the authors searched the database for a list of all articles which contained the word anywhere (headline or body).
All values are normalized to 100% so they represent their relative share of the top causes, rather than absolute
counts (e.g. ‘deaths’ represents each causes’ share of deaths within the 13 categories shown rather than total deaths). This allows for us to compare the relative representation of different sources.
Full methodology, notes and open-access data on GitHub are available from the original source: https://owenshen24.github.io/charting-death/.
Data: https://raw.githubusercontent.com/owid/owid-datasets/master//datasets/Causes%20of%20death%20vs.%20media%20coverage%20%28Shen%20et%20al.%202018%29/Causes%20of%20death%20vs.%20media%20coverage%20%28Shen%20et%20al.%202018%29.csv
[5]: