Planet SciPy

ListenData 2019-04-20 21:01:00

Loops in Python explained with examples

This tutorial covers various ways to execute loops in Python. Loops are an important concept in any programming language: they perform iteration, i.e. run specific code repeatedly until a certain condition is reached.

Real-Life Examples of Loops
  1. ATM software runs in a loop, processing transaction after transaction until you confirm that you have nothing more to do.
  2. Software on a mobile device lets the user try to unlock it with up to 5 password attempts. After that, it resets the device.
  3. You put your favorite song on repeat mode. That is also a loop.
  4. You want to run a particular analysis on each column of your data set.

1. For Loop

As in R and C, you can use a for loop in Python. It is one of the most commonly used loop constructs for automating repetitive tasks.

How does a for loop work?

Suppose you are asked to print the sequence of numbers from 1 to 9, in increments of 2.
for i in range(1,10,2):
range(1,10,2) means
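Since the excerpt is cut off here, the following is a minimal sketch of the loop it describes, not the original author's listing. The list name odd_numbers is introduced only for illustration.

```python
# range(start, stop, step): start at 1, stop before 10, advance by 2 each time
odd_numbers = []
for i in range(1, 10, 2):
    odd_numbers.append(i)  # collect each value so the sequence can be inspected
    print(i)
```

The stop value is exclusive, so a stop of 10 still ends the sequence at 9.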
Quansight Labs 2019-04-17 05:00:00

MOA: a theory for composable and verifiable tensor computations

Python-moa (mathematics of arrays) is an approach to a high-level tensor compiler that is based on the work of Lenore Mullin and her dissertation. A high-level compiler is necessary because there are many optimizations that a low-level compiler such as gcc will miss. It tries to solve many of the same problems as other technologies such as the taco compiler and the XLA compiler. However, it takes a much different approach, guided by the following principles.

  1. What is the shape? Everything has a shape: scalars, vectors, arrays, operations, and functions.
  2. What are the given indices and operations required to produce a given index in the result?
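To make the two questions concrete, here is a small sketch in plain Python. It does not use the python-moa API. The function names and the example expression, transpose(A + B), are assumptions chosen for illustration only.

```python
def transposed_sum_shape(shape):
    """Question 1: the shape of transpose(A + B) for 2-D inputs of this shape."""
    rows, cols = shape
    return (cols, rows)


def transposed_sum_at(a, b, i, j):
    """Question 2: element (i, j) of the result needs index (j, i) of each input."""
    return a[j][i] + b[j][i]


a = [[1, 2, 3], [4, 5, 6]]
b = [[10, 20, 30], [40, 50, 60]]

shape = transposed_sum_shape((2, 3))
# Each result element is built directly from the inputs, with no intermediate A + B array.
result = [
    [transposed_sum_at(a, b, i, j) for j in range(shape[1])]
    for i in range(shape[0])
]
print(shape, result)
```

Answering both questions up front is what lets the whole expression collapse into a single indexing rule instead of two passes over the data.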

Having a compiler guided by these principles allows for high-level reductions that other compilers will miss and allows for optimization of algorithms as a whole. Keep in mind that MOA is NOT a compiler. It is a theory that guides compiler development. Since python-moa is based on theory, we get unique properties that other compilers cannot guarantee:

Read more…

Living in an Ivory Basement 2019-04-15 22:00:00

Some questions and thoughts on journal peer review.

What's up with current peer review practice?

ListenData 2019-04-14 15:31:00

Create Dummy Data in Python

This article explains various ways to create dummy or random data in Python for practice. As in R, we can create dummy data frames using the pandas and numpy packages. Many analysts prepare data in MS Excel and later import it into Python to hone their data wrangling skills. This is not an efficient approach. It is more efficient to prepare random data in Python and use it later for data manipulation.

1. Enter Data Manually in Editor Window

The first step is to load the pandas package and use the DataFrame function.
import pandas as pd
data = pd.DataFrame({"A" : ["John","Deep","Julia","Kate","Sandy"],
"MonthSales" : [25,30,35,40,45]})
       A  MonthSales
0 John 25
1 Deep 30
2 Julia
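For the random-data case the article mentions, a minimal sketch using only the standard library might look like the following. The seed, the sales range of 20 to 50, and the variable names are assumptions made for illustration; the resulting dictionary has the same layout as the one passed to pd.DataFrame above.

```python
import random

# A seeded generator keeps the dummy data reproducible between runs
rng = random.Random(42)

names = ["John", "Deep", "Julia", "Kate", "Sandy"]
data = {
    "A": names,
    # One random monthly sales figure per name, between 20 and 50 inclusive
    "MonthSales": [rng.randint(20, 50) for _ in names],
}

for name, sales in zip(data["A"], data["MonthSales"]):
    print(name, sales)
```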
Anaconda 2019-04-11 16:24:17

The Human Element in AI

The over 45 speakers at AnacondaCON 2019 delved into how machine learning, artificial intelligence, enterprise, and open source communities are accomplishing great things with data — from optimizing urban farming to identifying the elements in…

The post The Human Element in AI appeared first on Anaconda.

Living in an Ivory Basement 2019-04-10 22:00:00

Things to think about when developing shotgun metagenome classifiers

Thoughts on goals and tradeoffs in classifying shotgun metagenome data.

ListenData 2019-04-09 18:47:00


The most common issue when installing a Python package on a company network is failure to verify the SSL certificate. Companies sometimes block certain websites on their network so that employees can't access them. Whenever employees try to visit these websites, they see "Access Denied because of company's policy". This causes a connection error when reaching the main Python website.

The error looks like this:

Could not fetch URL connection error: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:598)

PIP SSL Certification Issue

Solution :

Run the following command. Give a host name after each --trusted-host flag, and replace <package_name> with the name of the package.
pip install --trusted-host <host> --trusted-host <host> <package_name> -vvv
Suppose you want to install the pandas package. You would run the following command:
pip install --trusted-host <host> --trusted-host <host> pandas -vvv

The --trusted-host option marks the host as trusted, even though it does not have valid or any HTTPS
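Each --trusted-host flag needs a host name as its argument. The sketch below builds such a command in Python. The helper build_pip_command is hypothetical, and the two hosts (pypi.org and files.pythonhosted.org, the default package index and its file server) are an assumption about which hosts the network blocks, not something stated in the excerpt.

```python
import sys


def build_pip_command(package, trusted_hosts):
    """Return a pip install command in which every --trusted-host flag has a host."""
    command = [sys.executable, "-m", "pip", "install"]
    for host in trusted_hosts:
        command += ["--trusted-host", host]
    command += [package, "-vvv"]
    return command


# Assumed hosts: the default package index and its file server
command = build_pip_command("pandas", ["pypi.org", "files.pythonhosted.org"])
print(" ".join(command))
```

The list can be passed to subprocess.run(command, check=True) to perform the install. Trusting a host turns off certificate checks for it, so this is best kept to networks you control.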
About Author:

Deepanshu founded ListenData with a simple objective - Make analytics easy to understand and follow. He has over 7 years of experience in

Anaconda 2019-04-09 16:41:40

AnacondaCON 2019 Day 3 Recap: The Need for Speed, “Delightful UX” in Dev Tools, LOTR Jokes and More.

Everyone at Anaconda is still feeling the love from AnacondaCON 2019. Day 3 wrapped up last Friday with one more day of talks and sessions, highlighted by some powerhouse keynotes. Let’s get right to the good…

ListenData 2019-04-09 15:56:00

Install Python Package

Python is one of the most popular programming languages for data science and analytics. It is widely used for a variety of tasks in startups and many multinational organizations. The beauty of this programming language is that it is open source, which means it is available for free and has a very active community of developers across the world. Python developers share their solutions with other Python users in the form of packages or modules. This tutorial explains various ways to install a Python package.

Ways to Install Python Package

Method 1 : If Anaconda is already installed on your system

Anaconda is a data science platform that comes with popular Python packages pre-installed and a powerful IDE (Spyder) whose user-friendly interface makes writing Python scripts easier.

If Anaconda is installed on your system (laptop), click on Anaconda Prompt as shown in the image below.

Anaconda Prompt

To install a Python package or module, enter the command below in Anaconda Prompt:
pip install package-name
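Before installing, it can help to check whether a module is already importable. The following is a minimal sketch using only the standard library. The helper names is_installed and install are hypothetical, and running pip through sys.executable is one common way to target the interpreter in use, not the only one.

```python
import importlib.util
import subprocess
import sys


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


def install(package_name):
    """Run pip with the current interpreter so the package lands in this environment."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package_name])


print(is_installed("json"))  # json ships with Python, so it is always found
```

Note that the import name and the package name can differ, so a check on the module name is only a rough guide to whether a package still needs installing.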
Living in an Ivory Basement 2019-04-08 22:00:00

News from the NIH Data Commons Pilot Phase Consortium

The NIH Data Commons Pilot Phase Consortium is dead! (Long live the NIH Data Commons!)

Quansight Labs 2019-04-08 05:00:00

Thoughts on joining Quansight Labs

In his blog post welcoming me, Travis set out his vision for pushing forward the Python ecosystem for scientific computing and data science, and how to fund it. In this post I'll add my own perspectives to that. Given Quansight Labs' purpose, it seems fitting to start with how I see things as a community member and organizer.

A community perspective

The SciPy and PyData ecosystems have experienced massive growth over the past years, and this is likely to continue in the near future. As a maintainer, that feels very gratifying. At the same time it brings up worries. Core projects struggle to keep up with the growth in number of users. Funded development can help with this, if done right. Some of the things I would like to see from companies that participate in the ecosystem:

  • Explain innovations they're working on to the community and solicit input, at an early stage. Developing something away from the spotlight and then unveiling
Living in an Ivory Basement 2019-04-07 22:00:00

Critically assessing open science - the CAOS meeting.

A summary of the CAOS open science meeting

Two Bit Arcade - python 2019-04-07 15:20:00

Etch-A-Snap — The Raspberry Pi powered Etch-A-Sketch camera

Etch-A-Snap is (probably) the world's first Etch-A-Sketch camera. Powered by a Raspberry Pi Zero (or Zero W), it snaps photos just like any other camera, but outputs them by drawing to a Pocket Etch-A-Sketch screen. Quite slowly.

Photos are processed down to 240x144 pixel 1-bit (black & white) line drawings using …

Anaconda 2019-04-05 18:25:01

Anaconda 2019.03 Release

Windows is the most popular operating system in the world and consistently has 75% or more of the worldwide desktop market. According to the JetBrains Python Developers Survey, 49% of Python developers use Windows as…

Anaconda 2019-04-05 16:12:42

AnacondaCON 2019 Day 2 Recap: AI in Medicine, Cataloging the Contents of Stars, and More!

What You Missed at AnacondaCON Day 2 We’re back with a recap of Day 2 of our annual AnacondaCON. (In case you missed it, you can read our Day 1 recap here). Things started off…

Anaconda 2019-04-04 17:38:11

AnacondaCON 2019 Day 1 Recap: Big-Time Learning

AnacondaCON 2019 is off to a great start. As in past years, we programmed Day 1 with product- and package-specific tutorials for those looking to get hands-on learning with Anaconda Enterprise tools. Spots in these…

Anaconda 2019-03-29 02:33:32

3 Ways to Upskill in Python with DataCamp and Anaconda

DataCamp is proud to partner with Anaconda to offer eight courses on Conda and Python—in addition to the more than 70 total Python courses in DataCamp’s ever-expanding data science and analytics curriculum. Not sure where…

Anaconda 2019-03-26 17:14:11

Anaconda Enterprise 5.3 Release

The Anaconda Product Team has released Anaconda Enterprise 5.3, an upgrade focused on platform stability and reliability to support our customers’ needs for continuous innovation. Customizable Static Endpoints Users can now customize a static endpoint,…

Anaconda 2019-03-20 18:24:49

Announcing Public Anaconda Package Download Data

I’m very happy to announce that starting today, we will be publishing summarized download data for all conda packages served in the Anaconda Distribution, as well as the popular conda-forge and bioconda channels.  The dataset…

While My MCMC Gently Samples 2019-03-15 14:00:00

Computational Psychiatry: Combining multiple levels of analysis to understand brain disorders - PhD thesis

I noticed that, since my personal website at my former university went down, my PhD thesis could not be found anywhere, so I'm posting it here.

During my PhD I explored how machine learning and computational modeling of the brain can be used to improve our understanding, and diagnostics …

Python – Meta Rabbit 2019-03-12 12:00:51

NIXML: nix + YAML for easy reproducible environments

The rise and fall of bioconda

A year ago, I remember a conversation which went basically like this:

Them: So, to distribute my package, what do you think I should use?
Me: You should use bioconda.
Them: OK, that’s interesting, but what about …?
Me: No, you should use bioconda.
Them: I will definitely look …

Anaconda 2019-03-11 16:17:46

Understanding and Improving Conda’s performance

Lately, we have been responding to issues about Conda’s speed.  We’re working on it and we wanted to explain a few of the facets that we’re looking at to solve the problem.   TL;DR: make…

Anaconda 2019-03-11 15:58:34

End of Life (EOL) for Python 2.7 is coming. Are you ready?

We all knew it was coming. Back in 2014 when Guido van Rossum, Python’s creator and principal author, made the announcement, January 1,…

Living in an Ivory Basement 2019-03-01 23:00:00

Sustaining open source: thinking about communities of effort

Thinking about how to sustain open source.

Living in an Ivory Basement 2019-02-28 23:00:00

My recent reading re sustaining open communities

What has Titus been reading lately?

Filipe Saraiva's blog 2019-02-24 22:26:11

Reducing the pile

I have been a comics fan since childhood. The first comic books I got were in the first half of the 1990s: some Mickey, Mônica, Trapalhões and X-Men issues. In '98 I started buying X-Men, Fabulosos X-Men and Wolverine, up to the first issues of the infamous X-Men Premium. Out of money, I turned to manga and self-contained stories. When Panini starts to... [Read More]

Living in an Ivory Basement 2019-02-21 23:00:00

Threat models for open online scientific engagement?

What threats are there for scientists in engaging in open online discussions?

Anaconda 2019-02-14 21:26:28

Intake released on Conda-Forge

Intake is a package for cataloging, finding and loading your data. It has been developed recently by Anaconda, Inc., and continues to gain new features. To read general information about Intake and how to use…

Announcement: Audio TK 3.1.0

ATK is updated to 3.1.0 with heavy code refactoring. Old C++ standards are now dropped, and it now requires a fully C++17-compliant compiler. The main difference for filter support is that explicit SIMD filters using libsimdpp have been dropped while tr2::simd becomes standard and supported by gcc, clang and Visual Studio. Download link: ATK […]

Anaconda 2019-01-30 21:55:23

RPM and Debian Repositories for Miniconda

Conda, the package manager from Anaconda, is now available as either a RedHat RPM or as a Debian package. The packages are the equivalent to the Miniconda installer which only contains Conda and its dependencies.…

Filipe Saraiva's blog 2019-01-29 01:48:22


The robotic female voice (we have reached the time when questions of gender and robots can blur together) sounded, strange and familiar as always, as soon as the car finished the right turn: “You have entered Avenida Universitária; the speed limit is 60 kilometers per hour”. My father smiled and began to speak: – Since... [Read More]

While My MCMC Gently Samples 2019-01-21 15:00:00

My foreword to "Bayesian Analysis with Python, 2nd Edition" by Osvaldo Martin

When Osvaldo asked me to write the foreword to his new book I felt honored, excited, and a bit scared, so naturally I accepted. What follows is my best attempt to convey what makes probabilistic programming so exciting to me. Osvaldo did a great job with the book, it is …

Filipe Saraiva's blog 2019-01-21 14:39:48

Call for Answers: Survey About Task Assignment

Professor Igor Steinmacher, from Northern Arizona University, is a prominent researcher on several social dynamics in open source communities, such as support for newcomers, gender bias, open sourcing proprietary software, and more. Some of his papers can be found on his website. Currently, Prof. Igor is inviting mentors from open source communities to answer a survey... [Read More]

Living in an Ivory Basement 2019-01-15 23:00:00

Revisiting authorship, and JOSS software publications

The question du jour: how should authorship on software papers be decided?

While My MCMC Gently Samples 2019-01-14 15:00:00

Using Bayesian Decision Making to Optimize Supply Chains

(c) 2019 Thomas Wiecki & Ravin Kumar

As advocates of Bayesian statistics in data science we often have to convince business-minded colleagues or customers of the added value of such an approach. While there are many good reasons for applying Bayesian modeling to solve business problems (Sean J Taylor recently had …

Two Bit Arcade - python 2019-01-11 08:00:00

Gyroscopic 3D wireframe cube — Using a 3-axis gyro for live 3D perspective

This little project combines the previous accelerometer-gyroscope code with the 3D rotating OLED cube to produce a 3D cube which responds to gyro input, making it possible to "peek around" the cube with simulated perspective, or make it spin with a flick of the wrist.

Take a look at those …

Filipe Saraiva's blog 2019-01-08 02:21:02

Master's in Computer Science at UFPA 2019: Computational Intelligence for Smart Grids; Metaheuristics

Applications are open for the master's program in computer science at PPGCC-UFPA. In this round, I am offering 2 positions for students who will carry out their work alongside the other researchers at LAAI. The positions focus on computational intelligence applied to Smart Grids and on studies of metaheuristic optimization methods. I would like to... [Read More]

Filipe Saraiva's blog 2019-01-05 17:43:33

LaKademy 2018

In October 2018, Florianópolis hosted the sixth edition of LaKademy, the Latin American KDE sprint. This event is an opportunity to bring together in one place several KDE developers – both veterans and newcomers – from different projects to improve the software they work on and also to plan the actions of... [Read More]

Filipe Saraiva's blog 2019-01-05 16:59:19

LaKademy 2018

In October 2018, Florianópolis hosted the 6th edition of LaKademy, the Latin American KDE sprint. The event is an opportunity to bring together several KDE developers – both veterans and newcomers – from different projects to work on improving their respective software and plan the promotional actions of the community in the subcontinent. In... [Read More]

GPU Dask Arrays, first steps

The following code creates and manipulates 2 TB of randomly generated data.

import dask.array as da

rs = da.random.RandomState()
x = rs.normal(10, 1, size=(500000, 500000), chunks=(10000, 10000))
(x + 1)[::2, ::2].sum().compute(scheduler='threads')

On a single CPU, this computation takes two hours.

On an eight-GPU single-node system this computation takes nineteen seconds.
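The 2 TB figure can be checked with a little arithmetic. The sketch below assumes each value is a 64-bit float (8 bytes); the variable names are introduced only for illustration.

```python
shape = (500_000, 500_000)
chunks = (10_000, 10_000)
bytes_per_value = 8  # assumption: float64 values

# Size of the whole array and of a single chunk, in bytes
total_bytes = shape[0] * shape[1] * bytes_per_value
chunk_bytes = chunks[0] * chunks[1] * bytes_per_value

# Number of chunks the scheduler has to create and reduce
n_chunks = (shape[0] // chunks[0]) * (shape[1] // chunks[1])

print(total_bytes / 1e12, "TB in", n_chunks, "chunks of", chunk_bytes / 1e6, "MB")
```

Under that assumption the array is 2 TB split into 2500 chunks of 800 MB each, which is why it can be processed piece by piece without ever holding the full array in memory.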

Combine Dask Array with CuPy

Actually this computation isn’t that impressive. It’s a simple workload, for which most of the time is spent creating and destroying random data. The computation and communication patterns are simple, reflecting the simplicity commonly found in data processing workloads.

What is impressive is that we were able to create a distributed parallel GPU array quickly by composing these three existing libraries:

  1. CuPy provides a partial implementation of Numpy on the GPU.

  2. Dask Array provides chunked algorithms on top of Numpy-like libraries like Numpy and CuPy.

    This enables us to operate on more data than we could fit in memory by operating on that data in

Two Bit Arcade - python 2019-01-01 08:00:00

3-axis Accelerometer-Gyro — Measuring acceleration and orientation with an MPU6050

Measuring acceleration and rotation has a lot of useful applications, from drone or rocket stabilisation to making physically interactive handheld games.

An accelerometer measures proper acceleration, meaning the rate of change of velocity relative to its own rest frame. This is in contrast to coordinate acceleration, which is relative to …

Leonardo Uieda 2018-12-26 12:00:00

Manage project dependencies with conda environments

TL;DR: Create a conda environment for each project, capture exact versions when possible, automate activation and updating with a bash function.

I often work on several different projects involving software: Python libraries, papers, presentations, posters, this website, etc. Each project has different dependencies and there is a non-zero chance that these dependencies might be in conflict with each other. For example, I need Python 2.7 to work on a tesseroid modeling paper with a student, while my current work on


First Impressions of GPUs and PyData

I recently moved from Anaconda to NVIDIA within the RAPIDS team, which is building a PyData-friendly GPU-enabled data science stack. For my first week I explored some of the current challenges of working with GPUs in the PyData ecosystem. This post shares my first impressions and also outlines plans for near-term work.

First, let's start with the value proposition of GPUs: significant speed increases over traditional CPUs.

GPU Performance

Like many PyData developers, I’m loosely aware that GPUs are sometimes fast, but don’t deal with them often enough to have strong feelings about them.

To get a more visceral feel for the performance differences, I logged into a GPU machine, opened up CuPy (a Numpy-like GPU library developed mostly by Chainer in Japan) and cuDF (a Pandas-like library in development at NVIDIA) and did a couple of small speed comparisons:

Compare Numpy and Cupy
>>> import numpy, cupy

>>> x = numpy.random.random((10000, 10000))
>>> y = cupy.random.random((10000, 10000))

>>> %timeit bool((numpy.sin(x) ** 2 + numpy.cos(x) ** 2 == 1).all())
446 ms ± 53.1 ms per

Living in an Ivory Basement 2018-12-07 23:00:00

A quick read of _The genomic and proteomic landscape of the rumen microbiome_

Using short and long reads to assemble genomes from metagenomes!

Support Python 2 with Cython


Many popular Python packages are dropping support for Python 2 next month. This will be painful for several large institutions. Cython can provide a temporary fix by letting us compile a Python 3 codebase into something usable by Python 2 in many cases.

It’s not clear if we should do this, but it’s an interesting and little known feature of Cython.

Background: Dropping Python 2 Might be Harder than we Expect

Many major numeric Python packages are dropping support for Python 2 at the end of this year. This includes packages like Numpy, Pandas, and Scikit-Learn. Jupyter already dropped Python 2 earlier this year.

For most developers in the ecosystem this isn’t a problem. Most of our packages are Python-3 compatible and we’ve learned how to switch libraries. However, for larger companies or government organizations it’s often far harder to switch. The PyCon 2017 keynote by Lisa Guo and Hui Ding from Instagram gives a good look into why this can be challenging for large production codebases and also gives a good


Anatomy of an OSS Institutional Visit

I recently visited the UK Meteorology Office, a moderately large organization that serves the weather and climate forecasting needs of the UK (and several other nations). I was there with other open source colleagues including Joe Hamman and Ryan May from open source projects like Dask, Xarray, JupyterHub, MetPy, Cartopy, and the broader Pangeo community.

This visit was like many other visits I’ve had over the years that are centered around showing open source tooling to large institutions, so I thought I’d write about it in hopes that it helps other people in this situation in the future.

My goals for these visits are the following:

  1. Teach the institution about software projects and approaches that may help them to have a more positive impact on the world
  2. Engage them in those software projects and hopefully spread around the maintenance and feature development burden a bit

Step 1: Meet allies on the ground

We were invited by early adopters within the institution, both within the UK Met Office’s Informatics Lab

(continued...) 2018-11-16 23:00:00

Notes on the Frank-Wolfe Algorithm, Part II: A Primal-dual Analysis

This blog post extends the convergence theory from the first part of my notes on the Frank-Wolfe (FW) algorithm with convergence guarantees on the primal-dual gap which generalize and strengthen the convergence guarantees obtained in the first part.

Living in an Ivory Basement 2018-11-11 23:00:00

Creating a welcoming teaching/learning environment in workshops

It takes constant work to make a welcoming teaching/learning environment!

Living in an Ivory Basement 2018-11-08 23:00:00

Repeatability in Practice (2018 version)

How we do repeatability in the DIB Lab

Stéfan van der Walt - python 2018-10-31 07:00:00

Linking to emails in org-mode (using neomutt)

Where we store links to emails in org-mode, and open them using neomutt.

Filipe Saraiva's blog 2018-10-29 15:40:06

Ode to hatred

Yesterday, following the vote count for president in the second round, I cried. I cried with anger. I cried with hatred. Hatred because the one who won the election represents a total affront to the minimum of what we call civility. He defended dictatorship and torture, repeatedly. He promised to imprison or exile opponents. He promised to persecute teachers, artists, the intelligentsia. He said he will... [Read More]

Filipe Saraiva's blog 2018-10-28 14:08:15

2018 Elections: My letter to the family

Family, this is my last political statement here in the group before the result. You know me: I am a professor of computer science at UFPA, and I am one of those responsible for training the next software engineers and computational mathematicians of our region. I advise undergraduate, master's and also doctoral students, even with all the... [Read More]

Filipe Saraiva's blog 2018-10-17 16:19:00

Telegram's sharing architecture as a way to mitigate fake news on WhatsApp

Fake news has become the kind of problem we will have to confront somehow as soon as possible, or we will watch democracies being destroyed one by one. If the Trump case caught our attention but still seemed distant, the 2018 Brazilian elections came to show that the nice-guy uncle can turn into the... [Read More]

So you want to contribute to open source

Welcome new open source contributor!

I appreciated receiving the e-mail where you said you were excited about getting into open source and were particularly interested in working on a project that I maintain. This post has a few thoughts on the topic.

First, please forgive me for sending you to this post rather than responding with a personal e-mail. Your situation is common today, so I thought I’d write up thoughts in a public place, rather than respond personally.

This post has two parts:

  1. Some pragmatic steps on how to get started
  2. A personal recommendation to think twice about where you focus your time

Look for good first issues on Github

Most open source software (OSS) projects have a “Good first issue” label on their Github issue tracker. Here is a screenshot of how to find the “good first issue” label on the Pandas project:

(note that this may be named something else like “Easy to fix”)

This contains a list of issues that are important, but also

Filipe Saraiva's blog 2018-09-28 04:53:50

Onward with Ciro!

With only a few days left before the first round of the elections, I take this moment to declare my vote for Ciro Gomes and invite friends to reflect and also vote for the candidate. In a rather generous conception of political parties, they are organizations structured around an idea of social order that try, through... [Read More]

Announcement: Audio TK 3.0.0

ATK is updated to 3.0.0 with a major ABI break and code quality improvement (see here). Bugs in different areas were fixed. Development for additional modules was also simplified (the modelling lite is such a project based on Audio Toolkit). Download link: ATK 3.0.0 Changelog: 3.0.0 * Change size for gsl::index everywhere (change of ABI) […]

Two Bit Arcade - python 2018-09-23 07:00:00

3D wireframe cube with MicroPython — Basic 3D model rotation and projection

An ESP8266 is never going to compete with an actual graphics card. But it has more than enough oomph to explore the fundamentals of 3D graphics. In this short tutorial we'll go through the basics of creating a 3D scene and displaying it on an OLED screen using MicroPython.

This …
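As a rough sketch of the rotation and projection steps the title refers to, here is a plain Python version. It is not the tutorial's own code: the function names, the viewer distance, the scale, and the 128x64 screen centre are all assumptions chosen for illustration.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point around the vertical (y) axis by angle radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def project(point, viewer_distance=4.0, scale=32.0, center=(64, 32)):
    """Perspective-project a 3D point onto integer screen coordinates."""
    x, y, z = point
    factor = scale / (viewer_distance + z)  # points further away shrink
    return (round(center[0] + x * factor), round(center[1] - y * factor))


# The eight corners of a cube centred on the origin
cube = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]

# Turn the cube by 30 degrees, then flatten each corner to a pixel position
pixels = [project(rotate_y(vertex, math.radians(30))) for vertex in cube]
print(pixels)
```

Drawing lines between the projected corners that share an edge gives the wireframe. The viewer distance has to stay larger than the cube's reach along z so the divisor never hits zero.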

Spyder Blog 2018-09-21 00:00:00

QtConsole 4.4 Released!

We're excited to announce a significant update to QtConsole—the package that powers Spyder's IPython Console interface—which the Spyder team maintains in collaboration with Project Jupyter. Two of the biggest changes—user-selectable syntax highlighting themes, and enhanced external editor/IDE integration—are already built right into Spyder, so they'll likely be of more interest if you use QtConsole standalone or with another editor/IDE. However, most of the other changes should prove quite useful within Spyder as well, and many were in fact suggested and even implemented by users of our IDE. Particular highlights include a block indent/unindent feature, Select-All (Ctrl-Shift-A) being made cell-specific, Ctrl-Backspace and Ctrl-Delete behaving more intelligently across whitespace and line boundaries, Ctrl-D allowing you to easily exit ipdb, input() and the like, and numerous smaller enhancements and bug fixes. If you'd like to learn more about what's new, please check out our article over on the Jupyter blog, where we go over the major changes in more detail, with plenty


Dask Development Log

This work is supported by Anaconda Inc

To increase transparency I’m trying to blog more often about the current work going on around Dask and related projects. Nothing here is ready for production. This blogpost is written in haste, so refined polish should not be expected.

Since the last update in the 0.19.0 release blogpost two weeks ago we’ve seen activity in the following areas:

  1. Update Dask examples to use JupyterLab on Binder
  2. Render Dask examples into static HTML pages for easier viewing
  3. Consolidate and unify disparate documentation
  4. Retire the hdfs3 library in favor of the solution in Apache Arrow.
  5. Continue work on hyper-parameter selection for incrementally trained models
  6. Publish two small bugfix releases
  7. Blogpost from the Pangeo community about combining Binder with Dask
  8. Skein/Yarn Update

1: Update Dask Examples to use JupyterLab extension

The new dask-labextension embeds Dask’s dashboard plots into a JupyterLab session so that you can get easy access to information

Gaël Varoquaux - programming 2018-09-16 22:00:00

A foundation for scikit-learn at Inria

We have just announced that a foundation will be supporting scikit-learn at Inria [1]:

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to have secure employment for some existing core …

Leonardo Uieda 2018-09-14 12:00:00

Introducing Verde

Verde is a Python library for processing spatial data (bathymetry, geophysics surveys, etc) and interpolating it on regular grids (i.e., gridding).

It implements Green's functions based interpolation methods and other data processing routines. The type of gridding implemented in Verde is essentially fitting various linear models to spatial data and using them to predict new data on regular grids, which is what a lot of machine learning is all about. So Verde's gridder API is inspired by scikit-learn, the state of the art for machine learning in Python. The Green's functions that make up the Jacobian matrix (aka sensitivity or feature matrix) of the linear models generally come from elastic deformation theory. For example, the bi-harmonic spline (Sandwell, 1987) implemented in verde.Spline comes from the deformation of a thin elastic plate.

I submitted a

Pythonic Perambulations 2018-09-13 17:00:00

The Waiting Time Paradox, or, Why Is My Bus Always Late?

Image Source: Wikipedia License CC-BY-SA 3.0

If you, like me, frequently commute via public transit, you may be familiar with the following situation:

You arrive at the bus stop, ready to catch your bus: a line that advertises arrivals every 10 minutes. You glance at your watch and note the time... and when the bus finally comes 11 minutes later, you wonder why you always seem to be so unlucky.

Naïvely, you might expect that if buses are coming every 10 minutes and you arrive at a random time, your average wait would be something like 5 minutes. In reality, though, buses do not arrive exactly on schedule, and so you might wait longer. It turns out that under some reasonable assumptions, you can reach a startling conclusion:

When waiting for a bus that comes on average every 10 minutes, your average waiting time will be 10 minutes.

This is what is sometimes known as the waiting time paradox.
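The claim is easy to check with a small simulation. The sketch below is not from the post. It assumes that the "reasonable assumptions" mean gaps between buses that are exponentially distributed (a Poisson process); the function name and parameters are hypothetical.

```python
import bisect
import random


def average_wait(mean_gap=10.0, n_buses=200_000, n_riders=100_000,
                 regular=False, seed=0):
    """Estimate the mean wait (in minutes) of riders arriving at random times."""
    rng = random.Random(seed)

    # Build the bus arrival times from the gaps between consecutive buses.
    arrivals = []
    t = 0.0
    for _ in range(n_buses):
        # Assumption: random gaps are exponential (a Poisson process);
        # regular=True models a perfectly punctual schedule instead.
        t += mean_gap if regular else rng.expovariate(1.0 / mean_gap)
        arrivals.append(t)

    # Each rider shows up at a uniformly random time and takes the next bus.
    total_wait = 0.0
    for _ in range(n_riders):
        rider = rng.uniform(0.0, arrivals[-1])
        next_bus = arrivals[bisect.bisect_left(arrivals, rider)]
        total_wait += next_bus - rider
    return total_wait / n_riders
```

Calling `average_wait(regular=True)` should give a value close to 5 minutes, while `average_wait()` should come out close to 10 minutes: a random rider is more likely to land in a long gap than in a short one.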

I've encountered this idea before, and always wondered

(continued...) 2018-09-05 22:00:00

Three Operator Splitting

I discuss a recently proposed optimization algorithm: the Davis-Yin three operator splitting.

Dask Release 0.19.0

This work is supported by Anaconda Inc.

I’m pleased to announce the release of Dask version 0.19.0. This is a major release with bug fixes and new features. The last release was 0.18.2 on July 23rd. This blogpost outlines notable changes since the last release blogpost for 0.18.0 on June 14th.

You can conda install Dask:

conda install dask

or pip install from PyPI:

pip install dask[complete] --upgrade

Full changelogs are available here:

Notable Changes

A ton of work has happened over the past two months, but most of the changes are small and diffuse. Stability, feature parity with upstream libraries (like Numpy and Pandas), and performance have all significantly improved, but in ways that are difficult to condense into blogpost form.

That being said, here are a few of the more exciting changes in the new release.

Python Versions

We’ve dropped official support for Python 3.4 and added official support for Python 3.7.

Deploy on Hadoop Clusters

Over the past few months Jim Crist has built a suite of


Book: Building Machine Learning Systems with Python – third edition

A few years ago, Packt Publishing contacted me to be a technical reviewer for the first edition of Building Machine Learning Systems with Python, and I was impressed by the writing of Luis Pedro Coelho and Willi Richert. For the second edition, I was again a technical reviewer. Writing is not easy, especially when it’s not […]

Planet SciPy – I Love Symposia! 2018-08-30 04:48:05

Summer school announcement: 2nd Advanced Scientific Programming in Python (ASPP) Asia Pacific!

The Advanced Scientific Programming in Python (ASPP) summer school has had 10 successful iterations in Europe and one iteration here in Melbourne earlier this year. Another European iteration is starting next week in Camerino, Italy. Now, thanks to the generous sponsorship of CSIRO, and the efforts of Benjamin Schwessinger and Genevieve Buckley, two alumni from …

Living in an Ivory Basement 2018-08-28 22:00:00

Abstract for SIAM: Supporting and Sustaining Open Source Software Development: the Commons Perspective

How do we support and sustain open source software development?

Analog modelling: The Moog ladder filter emulation in Python

After my previous post on SPICE modelling in Python, I need a good supporting example to work up to on-the-fly compilation in C++. This schema will also require some changes to support more than simple nodal analysis, so this now becomes Modified Nodal Analysis with state equations. The simple model I […]

High level performance of Pandas, Dask, Spark, and Arrow

This work is supported by Anaconda Inc


How does Dask dataframe performance compare to Pandas? Also, what about Spark dataframes and what about Arrow? How do they compare?

I get this question every few weeks. This post is to avoid repetition.

  1. This answer is likely to change over time. I’m writing this in August 2018
  2. This question and answer are very high level. More technical answers are possible, but not contained here.

Answers

Pandas

If you’re coming from Python and have smallish datasets then Pandas is the right choice. It’s usable, widely understood, efficient, and well maintained.

Benefits of Parallelism

The performance benefit (or drawback) of using a parallel dataframe like Dask dataframes or Spark dataframes over Pandas will differ based on the kinds of computations you do:

  1. If you’re doing small computations then Pandas is always the right choice. The administrative costs of parallelizing will outweigh any benefit. You should not parallelize if your computations are taking less

Two Bit Arcade - python 2018-08-27 16:00:00

Displaying images on OLED screens — Using 1-bpp images in MicroPython

We've previously covered the basics of driving OLED I2C displays from MicroPython, including simple graphics commands and text. Here we look at displaying monochrome 1 bit-per-pixel images and animations using MicroPython on a Wemos D1.

Processing the images and correct choice of image-formats is important to get the most detail …
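To make "1 bit-per-pixel" concrete, here is a minimal sketch of packing a monochrome image into bytes. It is not the code from the post. The function name is hypothetical, and the layout (eight horizontal pixels per byte, leftmost pixel in the most significant bit, rows padded to a whole byte) is an assumption, since the excerpt does not say which layout the post uses.

```python
def pack_1bpp(rows):
    """Pack rows of 0/1 pixels into bytes, 8 pixels per byte, MSB first."""
    packed = bytearray()
    for row in rows:
        # Pad each row with background (0) pixels up to a whole number of bytes.
        padded = list(row) + [0] * (-len(row) % 8)
        for start in range(0, len(padded), 8):
            byte = 0
            for pixel in padded[start:start + 8]:
                # Shift earlier pixels left so the first pixel ends up in bit 7.
                byte = (byte << 1) | (1 if pixel else 0)
            packed.append(byte)
    return bytes(packed)
```

With this layout a 128x32 image takes 128 * 32 / 8 = 512 bytes, which is why 1-bpp images suit small microcontrollers.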

Two Bit Arcade - python 2018-08-25 08:00:00

Driving I2C OLED displays with MicroPython — I2C monochrome displays with SSD1306

These mini monochrome OLED screens make great displays for projects — perfect for data readout, simple UIs or monochrome games.

  • Wemos D1: v2.2+ or good imitations.
  • 0.91in OLED Screen: 128x32 pixels, I2C interface.
  • Breadboard: any size will do.
  • Wires: loose ends, or jumper leads.

Setting …

Two Bit Arcade - python 2018-08-23 19:00:00

Raindar — Desktop daily weather, forecast app in PyQt

The Raindar UI was created using Qt Designer and saved as a .ui file, which is available for download. This was converted to an importable Python file using pyuic5.

API key

Before running the application you need to obtain an API key from This key is unique to you …

Public Institutions and Open Source Software

As general purpose open source software displaces domain-specific all-in-one solutions, many institutions are re-assessing how they build and maintain software to support their users. This is true across for-profit enterprises, government agencies, universities, and home-grown communities.

While this shift brings opportunities for growth and efficiency, it also raises questions and challenges about how these institutions should best serve their communities as they grow increasingly dependent on software developed and controlled outside of their organization.

  • How do they ensure that this software will persist for many years?
  • How do they influence this software to better serve the needs of their users?
  • How do they transition users from previous all-in-one solutions to a new open source platform?
  • How do they continue to employ their existing employees who have historically maintained software in this field?
  • If they have a mandate to support this field, what is the best role for them to play, and how can they justify their efforts to the groups that control their budget?

This blogpost

Living in an Ivory Basement 2018-08-17 22:00:00

Can bits be the basis for a digital commons? (No.)

Bits cannot be the basis for a digital commons, because they are not rivalrous.

Spyder Blog 2018-08-14 00:00:00

Spyder 3.3.0 and 3.3.1 released!

We're pleased to release the next significant update in the stable Spyder 3 line, 3.3.0, along with its follow-on bugfix point release, 3.3.1, which is now live on PyPI and conda. As always, you can update with conda update spyder in the Anaconda Prompt/Terminal/command line (on Windows/macOS/Linux, respectively) if on Anaconda (recommended), or pip install --upgrade spyder otherwise. If you run into any trouble, please carefully read our new installation documentation and consult our Troubleshooting Guide, which contains straightforward solutions to the vast majority of install-related issues users have reported.

As a new minor version (3.3), it makes several substantial changes to Spyder's underpinnings that deserve some explanation, particularly the newly modular and portable console system that's now separated into its own spyder-kernels package, opening up several new options for users running Spyder in different environments. There's also a brand-new error reporting process, new options in the IPython console, usability and performance improvements for the Variable Explorer, multiple new and changed dependency requirements

While My MCMC Gently Samples 2018-08-13 14:00:00

Hierarchical Bayesian Neural Networks with Informative Priors

(c) 2018 by Thomas Wiecki

Imagine you have a machine learning (ML) problem but only small data (gasp, yes, this does exist). This often happens when your data set is nested -- you might have many data points, but only few per category. For example, in ad-tech you may want to predict …

Spyder Blog 2018-08-13 00:00:00

Spyder featured on Episode 1 of Open Source Directions web show

Quansight, the company recently founded by NumPy, SciPy and Anaconda creator Travis Oliphant to help connect companies with open source communities built around data science and machine learning, just released Episode 1 of its live webcast series, and it was all about Spyder! Spyder maintainer Carlos Córdoba, recently hired by Quansight and funded part-time to work on Spyder development as we announced a few weeks ago, was the featured guest on the show.

Carlos first shared his perspective on some of the key moments in Spyder's nearly 10-year development history, from its original creation by Pierre Raybaut and Carlos' initial involvement in the project to its more recent challenges and successes. He also demonstrated basic usage of Spyder, as well as some of its standout features, in a live on-screen demo. Carlos then went on to outline the current roadmap for Spyder 4 in the near future, and explained some of the key new features planned for it. Finally, he took

Neural Ensemble News 2018-08-10 17:48:00

NeuroML2/LEMS is moving into Neural Mass Models and whole brain networks

In the last months, as part of the Google Summer of Code 2018, I have been working on a project that aimed to implement neuronal models which represent averaged population activity on NeuroML2/LEMS. The project was supported by the INCF organisation and my mentor, Padraig Gleeson, and I had 3 months to shape and bring to life all the ideas that we had in our heads. This blog post summarises the core motivation of the project, the technical challenges, what I have done, and future steps.

NeuroML version 2 and LEMS were introduced in order to standardise the description of neuroscience computational models and facilitate the shareability of results among different research groups [1]. However, so far, NeuroML2/LEMS have focused on modelling spiking neurons and how information is exchanged between them in networks. With the introduction of neural mass models, NeuroML2/LEMS can be extended to study interactions between large-scale systems such as cortical regions and indeed whole brain dynamics. To achieve this,

Living in an Ivory Basement 2018-08-09 22:00:00

"Labor" and "Engaged effort"

Are "effort" and "labor" the same?

Gaël Varoquaux - programming 2018-07-31 22:00:00

Sprint on scikit-learn, in Paris and Austin

Two weeks ago, we held a scikit-learn sprint in Austin and Paris. Here is a brief report on progress and challenges.

Several sprints

We actually held two sprints in Austin: one open sprint, at the scipy conference sprints, which was open to new contributors, and one core sprint, for more …

Leonardo Uieda 2018-07-26 12:00:00

Websites for Earth Scientists on the academic job hunt

This is a list of the websites I use to search for academic jobs in the Earth Sciences (geophysics, geology, oceanography, meteorology, etc). They've been very useful to me (I found my current position through the CIG mailing list) and I hope that this post can help others who are looking to take the next step in their academic careers.

These sites list everything from Masters and PhD scholarships to postdoc positions and tenure-track professorships. Note that they are biased toward the US, Canada, Oceania, and Europe.

Mailing lists

Sign up for these and get email updates when new opportunities are posted (most are updated daily):

  • ES_JOBS_NET: I get around 10 emails from this list a day. Lately, I'm seeing a lot of

Spyder Blog 2018-07-23 00:00:00

State of the Spyder, Part 2: Looking up

After sharing some major milestones, development progress, and other tidbits from the past six months in Part 1 of this series (check that one out first if you haven't already), we now have some amazing news to share with you all here in Part 2, along with other status updates. That's not all, though—Part 3 will look ahead toward Spyder 4 and beyond, unveiling and explaining our full roadmap and going over the future possibilities even further afield.

Spyder Wins NumFOCUS Development Grant

First up, we're thrilled to announce a major part of what's making that plan possible (along with your support, of course!). This May, Spyder was awarded a $3000 development grant from NumFOCUS, an organization promoting better science through open code, to help with finishing Spyder 4! NumFOCUS is a nonprofit dedicated to supporting key scientific computing projects; promoting sustainability in the open source ecosystem; educating the next generation of scientists, engineers, developers and data analysts through their flagship

Leonardo Uieda 2018-07-20 12:00:00

Introducing Pooch

A friend to fetch your sample data files.

Pooch is a Python package that manages downloading data files over HTTP and storing them in a local directory. It is meant to be used by other Python libraries that ship sample data files for use in documentation, workshops, demos, etc.

For example, your package could define a module that has functions to load sample data (like scikit-learn does). If you want the data to live on the web (like in the Github repo) instead of shipping it with your package, Pooch can keep track of it and download it to the user's computer only when it's needed.
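The download-only-when-needed idea can be sketched with the standard library alone. This is not Pooch's API or implementation. Every name below (`fetch`, `registry`, `download`) is hypothetical, and the SHA-256 check is an assumed way of "keeping track" of a file.

```python
import hashlib
from pathlib import Path


def _sha256(data):
    """Hex digest used to check that a cached file is the expected one."""
    return hashlib.sha256(data).hexdigest()


def fetch(name, registry, cache_dir, download):
    """Return the local path of a sample file, downloading it only if needed.

    Hypothetical interface: `registry` maps file names to expected SHA-256
    hashes, and `download(name)` returns the file content as bytes.
    """
    if name not in registry:
        raise ValueError(f"{name!r} is not in the registry")
    path = Path(cache_dir) / name
    # Download only when the file is missing or its content is out of date.
    if not path.exists() or _sha256(path.read_bytes()) != registry[name]:
        data = download(name)
        if _sha256(data) != registry[name]:
            raise ValueError(f"hash mismatch for {name!r}")
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
    return path
```

A data-loading function in a package would call `fetch` and then read the returned path. Repeated calls reuse the cached copy, so the network is touched only the first time.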

This is what a module would look like using Pooch:

Module mypackage/
import pooch

# Get the version string from your project. You have one of these, right?

Planet SciPy – I Love Symposia! 2018-07-12 18:58:35

The road to scikit-image 1.0

This is the first in a series of posts about the joint scikit-image, scikit-learn, and dask sprint that took place at the Berkeley Institute of Data Science, May 28-Jun 1, 2018. In addition to the dask and scikit-learn teams, the sprint brought together three core developers of scikit-image (Emmanuelle Gouillart, Stéfan van der Walt, and …

Living in an Ivory Basement 2018-07-08 22:00:00

The Open Source Anti-Sisyphean League

We need an Open Source Anti-Sisyphean League!

python – Dr. Randal S. Olson 2018-07-04 20:41:32

Does batting order matter in Major League Baseball? A simulation approach

If you’ve ever watched Major League Baseball, one of the feature points of the sport is the batting line-up that each team decides upon before each game. Traditional baseball logic tells us that speedy, reliable hitters like Trea Turner should