Version 8 (modified by darkblueb, 6 years ago)

--

Why Notebooks on OSGeoLive 12?

  • Jupyter Notebooks are popular and increasingly mainstream.
  • It's the Data
  • Notebooks place an emphasis on individual learning and experimentation
  • Spatial is special; maps and geo data are uniquely suited to visualization

Why Jupyter is data scientists’ computational notebook of choice

You may have noticed the recent column in the journal Nature:

https://www.nature.com/articles/d41586-018-07196-1

Leading technical universities are building on data science (and using notebooks to teach!):

https://www.insidehighered.com/news/2018/11/02/big-data-ai-prompt-major-expansions-uc-berkeley-and-mit

Berkeley Division of Data Science https://news.berkeley.edu/2018/11/01/berkeley-inaugurates-division-of-data-science-and-information-connecting-teaching-and-research-from-all-corners-of-campus/

The Big Picture -- Data is the Fourth Fundamental Paradigm of Science

Some have described modern scientific practice as three paradigms:

  • empirical observation and experimentation
  • analytical or theoretical approaches
  • computational science or simulation

A Fourth Paradigm is introduced in a 2009 book published by Microsoft Research

The Fourth Paradigm: Data-Intensive Scientific Discovery

This book presents the first broad look at the rapidly emerging field of data-intensive science, with the goal of influencing the worldwide scientific and computing research communities and inspiring the next generation of scientists. Increasingly, scientific breakthroughs will be powered by advanced computing capabilities that help researchers manipulate and explore massive datasets. The speed at which any given scientific discipline advances will depend on how well its researchers collaborate with one another, and with technologists, in areas of eScience such as databases, workflow management, visualization, and cloud-computing technologies. This collection of essays expands on the vision of pioneering computer scientist Jim Gray for a new, fourth paradigm of discovery based on data-intensive science and offers insights into how it can be fully realized.

—Rhys Francis, Australian eResearch Infrastructure Council

The scripting languages R and Python are the tools of choice for analyzing and summarizing data. Jupyter supports R, Python and Julia (math) as first-class citizens; other kernels are available as community-supported modules.
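To make the idea of a "kernel" concrete: each language is registered with Jupyter through a small kernel.json file containing a launch command, a display name and a language tag. The sketch below only builds that structure in plain Python; it does not install or start anything. The helper name make_kernel_spec is hypothetical, and the argv values mirror the launch commands commonly registered by ipykernel and IRkernel, so treat them as illustrative rather than as what any given OSGeoLive install contains.

```python
import json


def make_kernel_spec(argv, display_name, language):
    """Return a dict shaped like the contents of a Jupyter kernel.json file.

    Hypothetical helper for illustration; Jupyter itself just reads the file.
    """
    return {
        "argv": list(argv),            # command line used to launch the kernel
        "display_name": display_name,  # label shown in the notebook UI
        "language": language,          # language name reported for the kernel
    }


# {connection_file} is a placeholder that Jupyter substitutes at launch time.
python_spec = make_kernel_spec(
    ["python3", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "Python 3",
    "python",
)

# Illustrative entry for an R kernel; the exact argv on a real system may differ.
r_spec = make_kernel_spec(
    ["R", "--slave", "-e", "IRkernel::main()", "--args", "{connection_file}"],
    "R",
    "R",
)

# Show what would be written to kernel.json for each language.
for spec in (python_spec, r_spec):
    print(json.dumps(spec, indent=2))
```

Because every kernel is described the same way, adding a community-supported language amounts to installing its kernel package and registering one more such file.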

Individuals, standards and accessibility are at the core of FOSS

Learning environments, from Literate Programming to Mindstorms to distributed learning.

Virtuous Cycle of technical contributions due to ease-of-use and popularity.

Exploration of a problem, library of code, concepts, data relationships at the fingertips.

Standards-friendly environment (in contrast to MATLAB and Wolfram Mathematica)

Maps are visualization for Humans

Geospatial data is uniquely suited to visualization.
