Planet SciPy

Matthieu Brucher's blog 2018-12-11 08:29:30

Fun review: the Lego Rough Terrain Crane

I started my Lego adult path with the Mk2 crane, and now Lego has a new crane. This one is bigger, meaner, in some aspects, but hopefully better as well. Bigger wheels, but half of them, red instead of yellow, broader, and double crane boom instead of a triple one, so a different set of […]
Living in an Ivory Basement 2018-12-07 23:00:00

A quick read of _The genomic and proteomic landscape of the rumen microbiome_

Using short and long reads to assemble genomes from metagenomes!

Anaconda 2018-12-07 00:45:20

Intake for Cataloging Spark

By: Martin Durant

Intake is an open source project for providing easy Pythonic access to a wide variety of data formats, and a simple cataloging system for these data sources. Intake is a new project, and all are encouraged to try and comment on it. PySpark is the Python interface to Apache Spark, a fast …

The post Intake for Cataloging Spark appeared first on Anaconda.

Anaconda 2018-12-04 22:09:12

Using Pip in a Conda Environment

Unfortunately, issues can arise when conda and pip are used together to create an environment, especially when the tools are used back-to-back multiple times, establishing a state that can be hard to reproduce. Most of these issues stem from the fact that conda, like other package managers, has limited abilities to control packages it did …

The post Using Pip in a Conda Environment appeared first on Anaconda.

Matthieu Brucher's blog 2018-12-04 08:25:10

Book review: Introduction to Electrical Circuit Analysis

A few weeks ago, I presented my work on automatic code generation from an electronic schema. I have many things to say on this subject, and one of them is this book. When you start analyzing a circuit, it is important to learn how to analyze a circuit. There are lots of books on electronics, but […]
Anaconda 2018-12-03 16:57:44

Python Data Visualization 2018: Moving Toward Convergence

By James A. Bednar

This post is the second in a three-part series on the current state of Python data visualization and the trends that emerged from SciPy 2018. In my previous post, I provided an overview of the myriad Python data visualization tools currently available, how they relate to each other, and their many …

The post Python Data Visualization 2018: Moving Toward Convergence appeared first on Anaconda.

Anaconda 2018-11-28 16:56:24

Understanding Conda and Pip

Conda and pip are often considered as being nearly identical. Although some of the functionality of these two tools overlap, they were designed and should be used for different purposes. Pip is the Python Packaging Authority’s recommended tool for installing packages from the Python Package Index, PyPI. Pip installs Python software packaged as wheels or …

The post Understanding Conda and Pip appeared first on Anaconda.

Support Python 2 with Cython


Many popular Python packages are dropping support for Python 2 next month. This will be painful for several large institutions. Cython can provide a temporary fix by letting us compile a Python 3 codebase into something usable by Python 2 in many cases.

It’s not clear if we should do this, but it’s an interesting and little known feature of Cython.

Background: Dropping Python 2 Might be Harder than we Expect

Many major numeric Python packages are dropping support for Python 2 at the end of this year. This includes packages like Numpy, Pandas, and Scikit-Learn. Jupyter already dropped Python 2 earlier this year.

For most developers in the ecosystem this isn’t a problem. Most of our packages are Python-3 compatible and we’ve learned how to switch libraries. However, for larger companies or government organizations it’s often far harder to switch. The PyCon 2017 keynote by Lisa Guo and Hui Ding from Instagram gives a good look into why this can be challenging for large production codebases and also gives a good

Matthieu Brucher's blog 2018-11-27 08:58:04

Monitoring embedded space quality for classification

A few weeks ago, on StackOverflow, a user asked for an accuracy measure on the embedded space for an autoencoder. This was with Keras, but I thought it would be a nice exercise for Tensorflow as well. The idea in this case is to add a few layers to the embedded space to create a […]

Anatomy of an OSS Institutional Visit

I recently visited the UK Meteorology Office, a moderately large organization that serves the weather and climate forecasting needs of the UK (and several other nations). I was there with other open source colleagues including Joe Hamman and Ryan May from open source projects like Dask, Xarray, JupyterHub, MetPy, Cartopy, and the broader Pangeo community.

This visit was like many other visits I’ve had over the years that are centered around showing open source tooling to large institutions, so I thought I’d write about it in hopes that it helps other people in this situation in the future.

My goals for these visits are the following:

  1. Teach the institution about software projects and approaches that may help them to have a more positive impact on the world
  2. Engage them in those software projects and hopefully spread around the maintenance and feature development burden a bit

Step 1: Meet allies on the ground

We were invited by early adopters within the institution, both within the UK Met Office’s Informatics Lab

Anaconda 2018-11-21 20:23:28

Deriving Business Value from Data Science Deployments

By Gus Cavanaugh

One of the biggest challenges facing organizations trying to derive value from data science and machine learning is deployment. In this post, we’ll take a look at three common approaches to deploying data science projects, and how Anaconda Enterprise simplifies deployment and allows data scientists to focus on building better models that …

The post Deriving Business Value from Data Science Deployments appeared first on Anaconda.

Matthieu Brucher's blog 2018-11-20 08:45:13

From netlist to code: strategies to implement schematics modelling

Today, I’m presenting at the ADC my work on analog modelling for the past year. I will make a more detailed post later this year, but I’d like to put some teasers here. SPICE netlists are an efficient way of representing electronic circuits and there are several very good free and paid simulators. Unfortunately, […]
Anaconda 2018-11-15 19:16:04

Python Data Visualization 2018: Why So Many Libraries?

This post is the first in a three-part series on the state of Python data visualization tools and the trends that emerged from SciPy 2018.

By James A. Bednar

At a special session of SciPy 2018 in Austin, representatives of a wide range of open-source Python visualization tools shared their visions for the future of …

The post Python Data Visualization 2018: Why So Many Libraries? appeared first on Anaconda.

2018-11-16 23:00:00

Notes on the Frank-Wolfe Algorithm, Part II: A Primal-dual Analysis

This blog post extends the convergence theory from the first part of my notes on the Frank-Wolfe (FW) algorithm with convergence guarantees on the primal-dual gap which generalize and strengthen the convergence guarantees obtained in the first part.
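As a rough illustration of what a Frank-Wolfe gap measures, here is a minimal sketch on a toy problem: minimizing a squared distance over the probability simplex. The objective, the target point, the step-size rule 2/(k+2), and every function name are assumptions chosen for illustration; none of them come from the notes themselves. For a convex objective, the quantity `gap` computed at an iterate upper-bounds that iterate's primal suboptimality, which is why it is a convenient stopping criterion.

```python
def frank_wolfe_simplex(target, n_iter=200):
    """Minimize f(x) = 0.5 * ||x - target||^2 over the probability simplex."""
    n = len(target)
    x = [1.0 / n] * n  # start at the barycenter of the simplex
    gaps = []
    for k in range(n_iter):
        grad = [xi - ti for xi, ti in zip(x, target)]
        # Linear minimization oracle: on the simplex, <grad, s> is minimized
        # by the vertex e_j with the smallest gradient coordinate.
        j = min(range(n), key=lambda i: grad[i])
        # Frank-Wolfe gap <grad, x - s> with s = e_j.
        gap = sum(g * xi for g, xi in zip(grad, x)) - grad[j]
        gaps.append(gap)
        # Move towards the chosen vertex with the classical 2 / (k + 2) step.
        step = 2.0 / (k + 2)
        x = [(1.0 - step) * xi for xi in x]
        x[j] += step
    return x, gaps


def objective(x, target):
    return 0.5 * sum((xi - ti) ** 2 for xi, ti in zip(x, target))


# The target lies inside the simplex, so the optimal value is 0.
target = [0.2, 0.5, 0.3]
x, gaps = frank_wolfe_simplex(target)
```

Each iterate stays in the simplex because it is a convex combination of the previous iterate and a vertex, so no projection step is needed.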

Matthieu Brucher's blog 2018-11-12 23:31:43

Fun review: the Lego Bugatti Chiron

This year, Lego published a set based on the Bugatti Chiron, one of the craziest cars, and built near my home town. It’s the second set in the Technic car collection series, and contrary to the Porsche, the color is inspired by a gorgeous real life car (I don’t think that the real Porsche exists…). […]
Living in an Ivory Basement 2018-11-11 23:00:00

Creating a welcoming teaching/learning environment in workshops

It takes constant work to make a welcoming teaching/learning environment!

Anaconda 2018-11-09 15:57:58

Choose Your Anaconda IDE Adventure: Jupyter, JupyterLab, or Apache Zeppelin

As humans we are faced with multiple choices every day. Every person is different: some people prefer Firefox while others like Chrome; some people prefer Python while others like R. Here at Anaconda, we abstain from engaging in language or IDE wars, and firmly believe our users shouldn’t have to compromise their preferences. That’s why …

The post Choose Your Anaconda IDE Adventure: Jupyter, JupyterLab, or Apache Zeppelin appeared first on Anaconda.

Living in an Ivory Basement 2018-11-08 23:00:00

Repeatability in Practice (2018 version)

How we do repeatability in the DIB Lab

Anaconda 2018-10-31 19:26:34

Who You Gonna Call? Halloween Tips & Treats to Protect You from Ghosts, Gremlins…and Software Vulnerabilities

By Michael Sarahan

Happy Halloween, readers. At Anaconda, we’re not too scared about things that go bump in the night. We’ve examined the data and concluded that it’s just the cleaning staff upstairs. We are, however, kept awake by the ever-present concern of the security and experience of our users! We’d like to take this …

The post Who You Gonna Call? Halloween Tips & Treats to Protect You from Ghosts, Gremlins…and Software Vulnerabilities appeared first on Anaconda.

Stéfan van der Walt - python 2018-10-31 07:00:00

Linking to emails in org-mode (using neomutt)

Where we store links to emails in org-mode, and open them using neomutt.

Anaconda 2018-10-30 19:42:41

Open Source Model Management Roundup: Polyaxon, Argo, and Seldon

By Daniel Rodriguez

One of the most common questions the Anaconda Enterprise team receives is something along the lines of: “But really, how difficult is it to build this using open source tools?” This is certainly a fair question, as open source does provide a lot of functionality while offering a lower entry price than …

The post Open Source Model Management Roundup: Polyaxon, Argo, and Seldon appeared first on Anaconda.

Filipe Saraiva's blog 2018-10-29 15:40:06

Ode to hatred

Yesterday, following the vote count for the presidential runoff, I cried. I cried out of anger. I cried out of hatred. Hatred because the one who won the election represents a total affront to the bare minimum of what we call civility. He repeatedly defended dictatorship and torture. He promised to imprison or exile opponents. He promised to persecute teachers, artists, intellectuals. He said he will...
Filipe Saraiva's blog 2018-10-28 14:08:15

Elections 2018: My letter to the family

Family, this is my last political statement here in the group before the result. You know me: I am a computer science professor at UFPA, and I am one of the people responsible for training the next software engineers and computational mathematicians of our region. I advise undergraduate, master's, and doctoral students, even with all the...
Anaconda 2018-10-23 21:15:50

Patching Source Code to Conda Build Recipes

By Casey Clements and Michael Sarahan

If you are a developer who relies upon conda, we hope to encourage you to begin building your own packages so that your projects can be used just like all of the other packages you rely upon. The success of Anaconda rests upon the ease to search for, install, …

The post Patching Source Code to Conda Build Recipes appeared first on Anaconda.

Filipe Saraiva's blog 2018-10-17 16:19:00

Telegram's sharing architecture to mitigate fake news on WhatsApp

Fake news has become the kind of problem that we will have to confront somehow as soon as possible, or we will watch democracies being destroyed one by one. If the Trump case caught our attention but still seemed distant, the 2018 Brazilian elections came to show that the friendly uncle can turn into the...
Matthieu Brucher's blog 2018-10-16 07:09:00

Audio ToolKit: Moving to C++17

Audio ToolKit started with only C++11 a long time ago, and now with version 3.1, it’s going to be full C++17. Let’s start with the problem. In Audio ToolKit, I’m using a set of meta programming functions to enable automatic conversions between types. This enables the user to connect a float input to a double […]

So you want to contribute to open source

Welcome new open source contributor!

I appreciated receiving the e-mail where you said you were excited about getting into open source and were particularly interested in working on a project that I maintain. This post has a few thoughts on the topic.

First, please forgive me for sending you to this post rather than responding with a personal e-mail. Your situation is common today, so I thought I’d write up thoughts in a public place, rather than respond personally.

This post has two parts:

  1. Some pragmatic steps on how to get started
  2. A personal recommendation to think twice about where you focus your time

Look for good first issues on Github

Most open source software (OSS) projects have a “Good first issue” label on their Github issue tracker. Here is a screenshot of how to find the “good first issue” label on the Pandas project:

(note that this may be named something else like “Easy to fix”)

This contains a list of issues that are important, but also

.pyMadeThis 2018-09-30 06:00:00

Dictionary Views & Set Operations — Working with dictionary view objects

The keys, values and items from a dictionary can be accessed using the .keys(), .values() and .items() methods. These methods return view objects which provide a view on the source dictionary.

The view objects dict_keys and dict_items support set-like operations (the latter only when all values are hashable) which ...
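The set-like behaviour described above can be sketched in a few lines. The dictionaries and variable names below are made up for illustration; only the `.keys()` and `.items()` view methods and the set operators are standard Python.

```python
prices_a = {"apple": 1.0, "pear": 2.5, "plum": 0.8}
prices_b = {"pear": 2.5, "plum": 0.9, "fig": 3.0}

# Keys present in both dictionaries (set intersection on dict_keys views).
common_keys = prices_a.keys() & prices_b.keys()

# Keys found only in the first dictionary (set difference).
only_in_a = prices_a.keys() - prices_b.keys()

# (key, value) pairs that are identical in both dictionaries. This works
# because every value here is hashable.
same_items = prices_a.items() & prices_b.items()

# Views are live: they reflect later changes to the source dictionary.
keys_view = prices_a.keys()
prices_a["kiwi"] = 1.2
has_kiwi = "kiwi" in keys_view
```

Note that the results of `&` and `-` are ordinary sets computed at that moment, while the view itself keeps tracking the dictionary.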

Filipe Saraiva's blog 2018-09-28 04:53:50

Ciro forward!

With a few days left before the first round of the elections, I take this moment to declare my vote for Ciro Gomes and invite friends to reflect and also vote for the candidate. In a rather generous conception of political parties, they are organizations structured around an idea of social order that try, through...
Matthieu Brucher's blog 2018-09-25 07:19:33

Announcement: Audio TK 3.0.0

ATK is updated to 3.0.0 with a major ABI break and code quality improvement (see here). Bugs in different areas were fixed. Development for additional modules was also simplified (the modelling lite is such a project based on Audio Toolkit). Download link: ATK 3.0.0 Changelog: 3.0.0 * Change size for gsl::index everywhere (change of ABI) […]
.pyMadeThis 2018-09-23 07:00:00

3D wireframe cube with MicroPython — Basic 3D model rotation and projection

An ESP8266 is never going to compete with an actual graphics card. It certainly won't produce anything approaching modern games. But it still makes a nice platform to explore the basics of 3D graphics. In this short tutorial we'll go through the basics of creating a 3D scene ...
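A minimal sketch of the two steps named in the title, rotation and projection, might look like the following. It is plain Python with no display driver; the 128x64 screen size, the `fov` and `distance` values, and all function names are assumptions made for illustration, not taken from the tutorial.

```python
import math

# The 8 vertices of a cube centred on the origin.
VERTICES = [(x, y, z) for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)]

# Edges join vertices that differ in exactly one coordinate.
EDGES = [
    (i, j)
    for i in range(8)
    for j in range(i + 1, 8)
    if sum(a != b for a, b in zip(VERTICES[i], VERTICES[j])) == 1
]


def rotate_y(point, angle):
    """Rotate a point around the Y axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c + z * s, y, -x * s + z * c)


def project(point, width=128, height=64, fov=64, distance=4):
    """Perspective-project a 3D point onto integer screen coordinates."""
    x, y, z = point
    factor = fov / (distance + z)  # points further away shrink
    return (int(width / 2 + x * factor), int(height / 2 - y * factor))


def wireframe(angle):
    """Return the cube edges as pairs of 2D screen points."""
    points = [project(rotate_y(v, angle)) for v in VERTICES]
    return [(points[i], points[j]) for i, j in EDGES]


lines = wireframe(math.radians(30))
```

On real hardware each pair in `lines` would be handed to whatever line-drawing call the display driver offers, and the angle would be incremented every frame.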

Filipe Saraiva's blog 2018-09-22 21:13:08

Akademy 2018

Look for your favorite KDE contributor in the official Akademy 2018 group photo. I was in Vienna to attend Akademy 2018, the annual KDE meeting. This was my fourth Akademy, preceded by Berlin’2012 (actually, Desktop Summit), Brno’2014, and Berlin’2016 (together with QtCon). Interestingly, I go to Akademy every...
Spyder Blog 2018-09-21 00:00:00

QtConsole 4.4 Released!

We're excited to announce a significant update to QtConsole—the package that powers Spyder's IPython Console interface—which the Spyder team maintains in collaboration with Project Jupyter. Two of the biggest changes—user-selectable syntax highlighting themes, and enhanced external editor/IDE integration—are already built right into Spyder, so they'll likely be of more interest if you use QtConsole standalone or with another editor/IDE. However, most of the other changes should prove quite useful within Spyder as well, and many were in fact suggested and even implemented by users of our IDE. Particular highlights include a block indent/unindent feature, Select-All (Ctrl-Shift-A) being made cell-specific, Ctrl-Backspace and Ctrl-Delete behaving more intelligently across whitespace and line boundaries, Ctrl-D allowing you to easily exit ipdb, input() and the like, and numerous smaller enhancements and bug fixes. If you'd like to learn more about what's new, please check out our article over on the Jupyter blog, where we go over the major changes in more detail, with plenty

Matthieu Brucher's blog 2018-09-20 07:28:44

Book review: Continuous Delivery With Docker And Jenkins

A decade ago, the objective was to have a build farm and do continuous integration (on each commit, build the application and run unit tests). Now, the objective is continuous delivery. This means that the new build is directly put into production. All the major applications are doing this, from Chrome to Spotify. You may […]
Matthieu Brucher's blog 2018-09-18 07:52:31

Compiling C++ code in memory with clang

I have tried to find the proper recipes to compile C++ code on the fly with clang and LLVM. It’s actually not that easy to achieve if you are not targeting LLVM Intermediate Representation, and unfortunately, the code here, working for LLVM 7, may not work for LLVM 8. Or 6. The pipeline There are […]

Dask Development Log

This work is supported by Anaconda Inc

To increase transparency I’m trying to blog more often about the current work going on around Dask and related projects. Nothing here is ready for production. This blogpost is written in haste, so refined polish should not be expected.

Since the last update in the 0.19.0 release blogpost two weeks ago we’ve seen activity in the following areas:

  1. Update Dask examples to use JupyterLab on Binder
  2. Render Dask examples into static HTML pages for easier viewing
  3. Consolidate and unify disparate documentation
  4. Retire the hdfs3 library in favor of the solution in Apache Arrow.
  5. Continue work on hyper-parameter selection for incrementally trained models
  6. Publish two small bugfix releases
  7. Blogpost from the Pangeo community about combining Binder with Dask
  8. Skein/Yarn Update

1: Update Dask Examples to use JupyterLab extension

The new dask-labextension embeds Dask’s dashboard plots into a JupyterLab session so that you can get easy access to information

Gaël Varoquaux - programming 2018-09-16 22:00:00

A foundation for scikit-learn at Inria

We have just announced that a foundation will be supporting scikit-learn at Inria [1]:

Growth and sustainability

This is an exciting turn for us, because it enables us to receive private funding. As a result, we will be able to have secure employment for some existing core …

Leonardo Uieda 2018-09-14 12:00:00

Introducing Verde

Verde is a Python library for processing spatial data (bathymetry, geophysics surveys, etc) and interpolating it on regular grids (i.e., gridding).

It implements Green's functions based interpolation methods and other data processing routines. The type of gridding implemented in Verde is essentially fitting various linear models to spatial data and using them to predict new data on regular grids, which is what a lot of machine learning is all about. So Verde's gridder API is inspired by scikit-learn, the state-of-the-art for machine learning in Python. The Green's functions that make up the Jacobian matrix (aka sensitivity or feature matrix) of the linear models generally come from elastic deformation theory. For example, the bi-harmonic spline (Sandwell, 1987) implemented in verde.Spline comes from the deformation of a thin elastic plate.

I submitted a

Pythonic Perambulations 2018-09-13 17:00:00

The Waiting Time Paradox, or, Why Is My Bus Always Late?

Image Source: Wikipedia License CC-BY-SA 3.0

If you, like me, frequently commute via public transit, you may be familiar with the following situation:

You arrive at the bus stop, ready to catch your bus: a line that advertises arrivals every 10 minutes. You glance at your watch and note the time... and when the bus finally comes 11 minutes later, you wonder why you always seem to be so unlucky.

Naïvely, you might expect that if buses are coming every 10 minutes and you arrive at a random time, your average wait would be something like 5 minutes. In reality, though, buses do not arrive exactly on schedule, and so you might wait longer. It turns out that under some reasonable assumptions, you can reach a startling conclusion:

When waiting for a bus that comes on average every 10 minutes, your average waiting time will be 10 minutes.

This is what is sometimes known as the waiting time paradox.
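The claim can be checked with a small simulation. The sketch below assumes that bus arrivals form a Poisson process, meaning the gaps between buses are independent and exponentially distributed with a mean of 10 minutes; that is one possible reading of "reasonable assumptions" and is my choice for illustration, as are the function and parameter names.

```python
import bisect
import random


def simulate_average_wait(mean_interval=10.0, n_buses=200_000,
                          n_riders=100_000, seed=0):
    rng = random.Random(seed)
    # Bus arrival times: cumulative sums of exponential gaps.
    arrivals = []
    t = 0.0
    for _ in range(n_buses):
        t += rng.expovariate(1.0 / mean_interval)
        arrivals.append(t)
    # Riders show up at uniformly random times. The final stretch is left
    # out so that every rider has a later bus to catch.
    horizon = arrivals[-1] * 0.99
    total_wait = 0.0
    for _ in range(n_riders):
        rider = rng.uniform(0.0, horizon)
        # Index of the first bus arriving strictly after the rider.
        next_bus = arrivals[bisect.bisect_right(arrivals, rider)]
        total_wait += next_bus - rider
    return total_wait / n_riders


average_wait = simulate_average_wait()
```

Under these assumptions `average_wait` comes out close to 10 rather than 5: a rider arriving at a random time is more likely to land in a long gap than in a short one.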

I've encountered this idea before, and always wondered

Filipe Saraiva's blog 2018-09-09 15:17:30

Akademy 2018

Look for your favorite KDE contributor in the Akademy 2018 group photo. This year I was in Vienna to attend Akademy 2018, the annual KDE world summit. It was my fourth Akademy after Berlin’2012 (in fact, Desktop Summit), Brno’2014, and Berlin’2016 (together with QtCon). Interestingly, I go to Akademy every 2 years – let’s try...

2018-09-05 22:00:00

Three Operator Splitting

I discuss a recently proposed optimization algorithm: the Davis-Yin three operator splitting.

Dask Release 0.19.0

This work is supported by Anaconda Inc.

I’m pleased to announce the release of Dask version 0.19.0. This is a major release with bug fixes and new features. The last release was 0.18.2 on July 23rd. This blogpost outlines notable changes since the last release blogpost for 0.18.0 on June 14th.

You can conda install Dask:

conda install dask

or pip install from PyPI:

pip install dask[complete] --upgrade

Full changelogs are available here:

Notable Changes

A ton of work has happened over the past two months, but most of the changes are small and diffuse. Stability, feature parity with upstream libraries (like Numpy and Pandas), and performance have all significantly improved, but in ways that are difficult to condense into blogpost form.

That being said, here are a few of the more exciting changes in the new release.

Python Versions

We’ve dropped official support for Python 3.4 and added official support for Python 3.7.

Deploy on Hadoop Clusters

Over the past few months Jim Crist has built a suite of

Matthieu Brucher's blog 2018-09-04 07:36:41

Book: Building Machine Learning Systems with Python – third edition

A few years ago, Packt Publishing contacted me to be a technical reviewer for the first edition of Building Machine Learning Systems with Python, and I was impressed by the writing of Luis Pedro Coelho and Willi Richert. For the second edition, I was again a technical reviewer. Writing is not easy, especially when it’s not […]
Planet SciPy – I Love Symposia! 2018-08-30 04:48:05

Summer school announcement: 2nd Advanced Scientific Programming in Python (ASPP) Asia Pacific!

The Advanced Scientific Programming in Python (ASPP) summer school has had 10 successful iterations in Europe and one iteration here in Melbourne earlier this year. Another European iteration is starting next week in Camerino, Italy. Now, thanks to the generous sponsorship of CSIRO, and the efforts of Benjamin Schwessinger and Genevieve Buckley, two alumni from …
Living in an Ivory Basement 2018-08-28 22:00:00

Abstract for SIAM: Supporting and Sustaining Open Source Software Development: the Commons Perspective

How do we support and sustain open source software development?

High level performance of Pandas, Dask, Spark, and Arrow

This work is supported by Anaconda Inc


How does Dask dataframe performance compare to Pandas? Also, what about Spark dataframes and what about Arrow? How do they compare?

I get this question every few weeks. This post is to avoid repetition.

  1. This answer is likely to change over time. I’m writing this in August 2018
  2. This question and answer are very high level. More technical answers are possible, but not contained here.

Answers

Pandas

If you’re coming from Python and have smallish datasets then Pandas is the right choice. It’s usable, widely understood, efficient, and well maintained.

Benefits of Parallelism

The performance benefit (or drawback) of using a parallel dataframe like Dask dataframes or Spark dataframes over Pandas will differ based on the kinds of computations you do:

  1. If you’re doing small computations then Pandas is always the right choice. The administrative costs of parallelizing will outweigh any benefit. You should not parallelize if your computations are taking less

.pyMadeThis 2018-08-27 16:00:00

Displaying images on OLED screens — Using 1-bpp images in MicroPython

We've previously covered the basics of driving OLED I2C displays from MicroPython, including simple graphics commands and text. Here we look at displaying monochrome 1 bit-per-pixel images and animations using MicroPython on a Wemos D1.

Processing the images and correct choice of image-formats is important to get the most ...
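To make the 1 bit-per-pixel idea concrete, here is a minimal sketch of packing a small monochrome image into bytes. The bit layout (8 horizontal pixels per byte, leftmost pixel in the most significant bit) and the function name are assumptions for illustration; the layout a given display or image format expects may differ.

```python
def pack_1bpp(pixels):
    """Pack rows of 0/1 pixels into bytes, 8 horizontal pixels per byte.

    The most significant bit holds the leftmost pixel. Rows are padded
    with zeros up to a whole number of bytes.
    """
    packed = bytearray()
    for row in pixels:
        for start in range(0, len(row), 8):
            chunk = row[start:start + 8]
            byte = 0
            for bit, value in enumerate(chunk):
                if value:
                    byte |= 0x80 >> bit
            packed.append(byte)
    return bytes(packed)


# A 10x2 test image: each row needs 2 bytes after padding.
image = [
    [1, 0, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 0, 0, 1],
]
data = pack_1bpp(image)
```

Packed this way, a 128x32 image occupies 512 bytes, which is why 1-bpp formats suit memory-constrained boards.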

.pyMadeThis 2018-08-26 12:00:00

Dictionaries — An almost complete guide to Python's key:value store

Dictionaries are key-value stores, meaning they store, and allow retrieval of data (or values) through a unique key. This is analogous with a real dictionary where you look up definitions (data) using a given key — the word. Unlike a language dictionary however, keys in Python dictionaries are not alphabetically sorted ...
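A short sketch of the lookup-by-key behaviour and of key ordering follows. The example words and variable names are made up for illustration.

```python
# Values are stored and retrieved through a unique key.
definitions = {}
definitions["zebra"] = "an African wild horse with stripes"
definitions["apple"] = "a round fruit"
definitions["mango"] = "a tropical stone fruit"

# Lookup by key; .get() returns a default instead of raising KeyError.
apple = definitions["apple"]
missing = definitions.get("kiwi", "no definition")

# Keys are not sorted alphabetically. In Python 3.7+ iteration follows
# insertion order; sort explicitly when alphabetical order is needed.
insertion_order = list(definitions)
alphabetical = sorted(definitions)
```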

.pyMadeThis 2018-08-25 08:00:00

Driving I2C OLED displays with MicroPython — I2C monochrome displays with SSD1306

These mini monochrome OLED screens make great displays for projects — perfect for data readout, simple UIs or monochrome games.

Wemos D1: v2.2+ or good imitations.
0.91in OLED Screen: 128x32 pixels, I2C interface.
Breadboard: Any size will do.
Wires: Loose ends, or jumper leads.

Setting ...
.pyMadeThis 2018-08-23 19:00:00

Raindar — Desktop daily weather, forecast app in PyQt

The Raindar UI was created using Qt Designer, and saved as a .ui file, which is available for download. This was converted to an importable Python file using pyuic5.

API key

Before running the application you need to obtain an API key from This key is unique to you ...

Public Institutions and Open Source Software

As general purpose open source software displaces domain-specific all-in-one solutions, many institutions are re-assessing how they build and maintain software to support their users. This is true across for-profit enterprises, government agencies, universities, and home-grown communities.

While this shift brings opportunities for growth and efficiency, it also raises questions and challenges about how these institutions should best serve their communities as they grow increasingly dependent on software developed and controlled outside of their organization.

  • How do they ensure that this software will persist for many years?
  • How do they influence this software to better serve the needs of their users?
  • How do they transition users from previous all-in-one solutions to a new open source platform?
  • How do they continue to employ their existing employees who have historically maintained software in this field?
  • If they have a mandate to support this field, what is the best role for them to play, and how can they justify their efforts to the groups that control their budget?

This blogpost


Cloud Lock-in and Open Standards

This post is from conversations with Peter Wang, Yuvi Panda, and several others. Yuvi expresses his own views on this topic on his blog.


When moving to the cloud we should be mindful to avoid vendor lock-in by adopting open standards.

Adoption of cloud computing

Cloud computing is taking over both for-profit enterprises and public/scientific institutions. The Cloud is cheap, flexible, requires little up-front investment, and enables greater collaboration. Cloud vendors like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure compete to create stable, easy to use platforms to serve the needs of a variety of institutions, both big and small. This presents a great opportunity for society, but also a risk of future lock-in at a large scale.

Cloud vendors build services to lock in users

Some of the competition between cloud vendors is about providing lower costs, higher availability, improved scaling, and so on, that are strictly a benefit for consumers. This is great.

However some of the competition is in the form of

Living in an Ivory Basement 2018-08-17 22:00:00

Can bits be the basis for a digital commons? (No.)

Bits cannot be the basis for a digital commons, because they are not rivalrous.

Spyder Blog 2018-08-14 00:00:00

Spyder 3.3.0 and 3.3.1 released!

We're pleased to release the next significant update in the stable Spyder 3 line, 3.3.0, along with its follow-on bugfix point release, 3.3.1, which is now live on PyPI and conda. As always, you can update with conda update spyder in the Anaconda Prompt/Terminal/command line (on Windows/macOS/Linux, respectively) if on Anaconda (recommended), or pip install --upgrade spyder otherwise. If you run into any trouble, please carefully read our new installation documentation and consult our Troubleshooting Guide, which contains straightforward solutions to the vast majority of install-related issues users have reported.

As a new minor version (3.3), it makes several substantial changes to Spyder's underpinnings that deserve some explanation, particularly the newly modular and portable console system that's now separated into its own spyder-kernels package, opening up several new options for users running Spyder in different environments. There's also a brand-new error reporting process, new options in the IPython console, usability and performance improvements for the Variable Explorer, multiple new and changed dependency requirements

While My MCMC Gently Samples 2018-08-13 14:00:00

Hierarchical Bayesian Neural Networks with Informative Priors

(c) 2018 by Thomas Wiecki

Imagine you have a machine learning (ML) problem but only small data (gasp, yes, this does exist). This often happens when your data set is nested -- you might have many data points, but only few per category. For example, in ad-tech you may want to predict …

Spyder Blog 2018-08-13 00:00:00

Spyder featured on Episode 1 of Open Source Directions web show

Quansight, the company recently founded by NumPy, SciPy and Anaconda creator Travis Oliphant to help connect companies with open source communities built around data science and machine learning, just released Episode 1 of its live webcast series, and it was all about Spyder! Spyder maintainer Carlos Córdoba, recently hired by Quansight and funded part-time to work on Spyder development as we announced a few weeks ago, was the featured guest on the show.

Carlos first shared his perspective on some of the key moments in Spyder's nearly 10-year development history, from its original creation by Pierre Raybaut and Carlos' initial involvement in the project to its more recent challenges and successes. He also demonstrated basic usage of Spyder, as well as some of its standout features, in a live on-screen demo. Carlos then went on to outline the current roadmap for Spyder 4 in the near future, and explained some of the key new features planned for it. Finally, he took

Neural Ensemble News 2018-08-10 17:48:00

NeuroML2/LEMS is moving into Neural Mass Models and whole brain networks

In the last months, as part of the Google Summer of Code 2018, I have been working on a project that aimed to implement neuronal models which represent averaged population activity on NeuroML2/LEMS. The project was supported by the INCF organisation and my mentor, Padraig Gleeson, and I had 3 months to shape and bring to life all the ideas that we had in our heads. This blog post summarises the core motivation of the project, the technical challenges, what I have done, and future steps.

NeuroML version 2 and LEMS were introduced in order to standardise the description of neuroscience computational models and facilitate the shareability of results among different research groups [1]. However, so far, NeuroML2/LEMS have focused on modelling spiking neurons and how information is exchanged between them in networks. With the introduction of neural mass models, NeuroML2/LEMS can be extended to study interactions between large-scale systems such as cortical regions and indeed whole brain dynamics. To achieve this,
Living in an Ivory Basement 2018-08-09 22:00:00

"Labor" and "Engaged effort"

Are "effort" and "labor" the same?

Building SAGA optimization for Dask arrays

This work is supported by ETH Zurich, Anaconda Inc, and the Berkeley Institute for Data Science

At a recent Scikit-learn/Scikit-image/Dask sprint at BIDS, Fabian Pedregosa (a machine learning researcher and Scikit-learn developer) and Matthew Rocklin (Dask core developer) sat down together to develop an implementation of the incremental optimization algorithm SAGA on parallel Dask datasets. The result is a sequential algorithm that can be run on any dask array, and so allows the data to be stored on disk or even distributed among different machines.

It was interesting to see both how the algorithm performed and how easy or challenging it was to run a research algorithm on a Dask distributed dataset.
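For readers unfamiliar with SAGA, here is a minimal NumPy sketch of the update rule for a sum of per-sample logistic losses f(a_i^T x, b_i) with labels in {-1, +1}. It is an illustration only, not the Numba or Dask implementation described in this post; the function names `logistic_loss_derivative` and `saga`, the uniform sampling with replacement, and the step size suggested afterwards are all assumptions made for the example.

```python
import numpy as np


def logistic_loss_derivative(p, y):
    # Derivative of log(1 + exp(-y * p)) with respect to p, for y in {-1, +1}.
    return -y / (1.0 + np.exp(y * p))


def saga(A, b, step_size, n_epochs=20, seed=0):
    """Plain SAGA on dense NumPy data (hypothetical helper, no regularization)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = A.shape
    x = np.zeros(n_features)
    # For linear models the gradient of sample i is a scalar times A[i],
    # so only that scalar needs to be remembered per sample.
    memory = np.zeros(n_samples)
    grad_avg = np.zeros(n_features)  # average of the remembered gradients
    for _ in range(n_epochs * n_samples):
        i = rng.integers(n_samples)  # pick one sample uniformly at random
        new = logistic_loss_derivative(A[i] @ x, b[i])
        delta = (new - memory[i]) * A[i]  # fresh gradient minus remembered one
        x -= step_size * (delta + grad_avg)  # variance-reduced step
        grad_avg += delta / n_samples  # keep the running average in sync
        memory[i] = new
    return x
```

A commonly used step size for SAGA is 1 / (3 L), where L bounds the smoothness of each per-sample loss; for the logistic loss one can take L = 0.25 * max_i ||a_i||^2. Each step touches a single row of A, which is why the algorithm is inherently sequential.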


We started with an initial implementation that Fabian had written for Numpy arrays using Numba. The following code solves an optimization problem of the form

\min_x \sum_{i=1}^n f(a_i^T x, b_i)

import numpy as np
from numba import njit
from sklearn.linear_model.sag import get_auto_step_size
from sklearn.utils.extmath import row_norms

def deriv_logistic(p, y):
    # derivative of logistic loss
Gaël Varoquaux - programming 2018-07-31 22:00:00

Sprint on scikit-learn, in Paris and Austin

Two weeks ago, we held a scikit-learn sprint in Austin and Paris. Here is a brief report on progress and challenges.

Several sprints

We actually held two sprints in Austin: one open sprint, at the scipy conference sprints, which was open to new contributors, and one core sprint, for more …

Leonardo Uieda 2018-07-26 12:00:00

Websites for Earth Scientists on the academic job hunt

This is a list of the websites I use to search for academic jobs in the Earth Sciences (geophysics, geology, oceanography, meteorology, etc). They've been very useful to me (I found my current position through the CIG mailing list) and I hope that this post can help others who are looking to take the next step in their academic careers.

These sites list everything from Masters and PhD scholarships to postdoc positions and tenure-track professorships. Note that they are biased toward the US, Canada, Oceania, and Europe.

Mailing lists

Sign up for these and get email updates when new opportunities are posted (most are updated daily):

  • ES_JOBS_NET: I get around 10 emails from this list a day. Lately, I'm seeing a lot of
Spyder Blog 2018-07-23 00:00:00

State of the Spyder, Part 2: Looking up

After sharing some major milestones, development progress, and other tidbits from the past six months in Part 1 of this series (check that one out first if you haven't already), we now have some amazing news to share with you all here in Part 2, along with other status updates. That's not all, though—Part 3 will look ahead toward Spyder 4 and beyond, unveiling and explaining our full roadmap and going over the future possibilities even further afield.

Spyder Wins NumFOCUS Development Grant

First up, we're thrilled to announce a major part of what's making that plan possible (along with your support, of course!). This May, Spyder was awarded a $3000 development grant from NumFOCUS, an organization promoting better science through open code, to help with finishing Spyder 4! NumFOCUS is a nonprofit dedicated to supporting key scientific computing projects; promoting sustainability in the open source ecosystem; educating the next generation of scientists, engineers, developers and data analysts through their flagship

Leonardo Uieda 2018-07-20 12:00:00

Introducing Pooch

A friend to fetch your sample data files.

Pooch is a Python package that manages downloading data files over HTTP and storing them in a local directory. It is meant to be used by other Python libraries that ship sample data files for use in documentation, workshops, demos, etc.

For example, your package could define a module that has functions to load sample data (like scikit-learn does). If you want the data to live on the web (like in the Github repo) instead of shipping it with your package, Pooch can keep track of it and download it to the user's computer only when it's needed.
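The "download only when it's needed" idea can be sketched with the standard library alone. The snippet below is not Pooch's API; the helper names `sha256_of` and `fetch`, and the `registry` mapping of file names to SHA256 hashes, are hypothetical and exist only to illustrate the caching logic.

```python
import hashlib
import urllib.request
from pathlib import Path


def sha256_of(path):
    # Hash the file in chunks so large data files do not need to fit in memory.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch(name, base_url, cache_dir, registry):
    """Return a local path to `name`, downloading it only if needed."""
    if name not in registry:
        raise ValueError(f"'{name}' is not in the registry")
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    # Download when the file is missing or its content no longer matches.
    if not target.exists() or sha256_of(target) != registry[name]:
        urllib.request.urlretrieve(base_url + name, str(target))
        if sha256_of(target) != registry[name]:
            raise ValueError(f"hash mismatch for downloaded file '{name}'")
    return str(target)
```

A package would call `fetch("some-file.csv", base_url, cache_dir, registry)` inside its data-loading function, so the first call downloads the file and later calls reuse the cached copy.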

This is what a module would look like using Pooch:

Module mypackage/
import pooch

# Get the version string from your project. You have one of these, right?
Planet SciPy – I Love Symposia! 2018-07-12 18:58:35

The road to scikit-image 1.0

This is the first in a series of posts about the joint scikit-image, scikit-learn, and dask sprint that took place at the Berkeley Institute for Data Science, May 28-Jun 1, 2018. In addition to the dask and scikit-learn teams, the sprint brought together three core developers of scikit-image (Emmanuelle Gouillart, Stéfan van der Walt, and …
Living in an Ivory Basement 2018-07-08 22:00:00

The Open Source Anti-Sisyphean League

We need an Open Source Anti-Sisyphean League!

python – Dr. Randal S. Olson 2018-07-04 20:41:32

Does batting order matter in Major League Baseball? A simulation approach

If you’ve ever watched Major League Baseball, you know that one of the defining features of the sport is the batting line-up that each team decides upon before each game. Traditional baseball logic tells us that speedy, reliable hitters like Trea Turner should
Living in an Ivory Basement 2018-07-01 22:00:00

A framework for thinking about Open Source Sustainability?

Can we apply Common Pool Resource work to open online projects?

Living in an Ivory Basement 2018-06-25 22:00:00

How open is too open?

How open is too open?

Planet SciPy – I Love Symposia! 2018-06-20 11:19:46

What do scientists know about open source?

A friend recently pointed out this great talk by Matt Bernius, What students know and don’t know about open source. If you have even a minor interest in open source it’s worth a watch, but the gist is: in the US alone, there are about 200,000 students enrolled in a computer science major. Open source …
Filipe Saraiva's blog 2018-06-17 13:19:41

The time I talked about science fiction with my psychologist

"So, Amanda, you know, sometimes I think that one of the things that made me this way was the sheer amount of science fiction I read in my childhood and adolescence… but not just any fiction, I mean only the kind about time travel and alternate realities. My favorite movie is Back to the Future, the stories I most...
Paul Ivanov’s Journal 2018-06-12 07:00:00

Get in it

Two weeks ago, Project Jupyter had our only planned team meeting for 2018. There was too much stuff going on for me to write a poem during the event as I had in previous years (2016 and 2017), so I ended up reading one of the pieces I wrote during my evening introvert breaks in Cleveland at PyCon a few weeks earlier.

Once again, Fernando and Matthias had their gadgets ready to record (thank you both!). The video below was taken by Fernando.

Get in it
Time suspended
Gelatinous reality - the haze
submerged in murky drops summed
in swamp pond of life

believe and strive, expand the mind
A state sublime, when in your prime you came to
me and we were free to flow and fling our
cares, our dreams, our in-betweens, our
rêves perdues, our residue -- the lime of light
the black of sight -- all these converge and
merge the forks of friction filled with fright
and more -- the float of logs that plunges deep
beyond the fray, beyond the
.pyMadeThis 2018-06-11 06:00:00

7Pez — Desktop unzip application with custom window decoration

This is a functionally terrible unzip application, saved only by the fact that you get to look at a cat while using it.

The original idea reflected in the name 7Pez was actually worse — to rig it up so you had to push on the head to unzip each file ...

Filipe Saraiva's blog 2018-06-03 22:04:47

The strike over biometric time clocks

A few days ago, Belém emerged from a bus workers' strike that brought the city to its knees. For 5 days Belém had no buses at all, with the union defying the labor court's order to keep 80% of the fleet on the streets. Facing heavy fines because of it but standing firm nonetheless, this situation showed how...
.pyMadeThis 2018-06-03 16:30:00

Failamp — Multimedia playlist & player in Python, using PyQt

Failamp is a simple audio & video media player implemented in Python, using the built-in Qt playlist and media handling features. It is modelled, very loosely, on the original Winamp, although nowhere near as complete (hence the fail).

The main window

The main window UI was built using Qt Designer. The screenshot ...

.pyMadeThis 2018-05-31 06:00:00

Creating a window with PyQt5 — The first step in creating your GUI application

The first step in creating desktop applications with PyQt is getting a window to show up on your desktop. Thankfully, with PyQt that is pretty simple.

Below are a few short examples of creating PyQt apps and getting a window on the screen. If this works you know you have ...

.pyMadeThis 2018-05-27 19:00:00

QtWebEngineWidgets, the new browser API in PyQt 5.6 — Simplified page model and asynchronous methods

With the release of Qt 5.5 the Qt WebKit API was deprecated and replaced with the new QtWebEngine API, based on Chromium. The WebKit API was subsequently removed from Qt entirely with the release of Qt 5.6 in mid-2016.

The change to use Chromium for web widgets within ...

.pyMadeThis 2018-05-25 19:00:00

Brown Note — Desktop notes app using SQLAlchemy & PyQt

Relieve your creative blockages with these interactive desktop reminders.

Brown Note is a desktop notes application written in Python, using PyQt. The notes are implemented as decoration-less windows, which can be dragged around the desktop and edited. Details in the notes, and their position on the desktop, are stored in ...

Python – Meta Rabbit 2018-05-25 11:58:38

Quick followups: NGLess benchmark & Notebooks as papers

A quick follow-up on two earlier posts: We finalized the benchmark for ngless that I had discussed earlier: As you can see, NGLess performs much better than either MOCAT or htseq-count. We tried to use featureCounts too, but that completely failed to produce results for some of the samples (we gave it a whopping 1TB …
.pyMadeThis 2018-05-14 06:00:00

Lucky Cat Spinning-arm Display — Python-powered Maneki-neko persistence of vision scroller

This build started as something simple: a lucky cat which would turn on and off automatically in response to some event. Since lucky cats are associated with good fortune the idea was to make one do this every time I got paid. This was working pretty well but unfortunately, after ...

.pyMadeThis 2018-05-07 20:00:00

NSAViewer — Webcam viewer & photo booth in Python, using PyQt

This app isn't actually a direct line from your webcam to the NSA, it's a demo of using the webcam/camera support in Qt. The name is a nod to the paranoia (or is it...) of being watched through your webcam by government spooks.

I did consider making ...

Spyder Blog 2018-05-06 00:00:00

State of the Spyder, Part 1: Looking back

As we approach some major development milestones, now is as good a time as ever to share with you some perspective on where we've been, what's happening now, and where we're going in the world of Spyder. In this post, part one of a three-part series, we'll take a look back over the past six months at some of the key events, accomplishments and challenges for Spyder and its community, and how that all leads up to where we are now.

Stay tuned right here, since part two will share several exciting announcements that affect the project (in a good way, we promise!) and its immediate future. Even better, part three will formally announce the next Spyder 3 release and—what I'm sure you are all looking forward to—the plan for the first official Spyder 4 beta, plus our schedule and feature roadmap for Spyder 4 and beyond!

A Call Answered

Starting off, as we announced back in mid-November, our funding from Anaconda, Inc was

While My MCMC Gently Samples 2018-05-03 14:00:00

An intuitive, visual guide to copulas

(c) 2018 by Thomas Wiecki

People seemed to enjoy my intuitive and visual explanation of Markov chain Monte Carlo so I thought it would be fun to do another one, this time focused on copulas.

If you ask a statistician what a copula is they might say "a copula is …

.pyMadeThis 2018-04-29 17:00:00

Megasolid Idiom — Simple rich text editor in Python

Megasolid Idiom is a rich text word processor implemented in Python and Qt. You can use it to open, edit and save HTML-formatted files, with a WYSIWYG (what you see is what you get) format view. Only basic formatting, headings, lists and images are supported.

Megasolid Idiom is based on ...

.pyMadeThis 2018-04-23 06:00:00

Calculon — Writing a simple desktop calculator in Python

Calculators are one of the simplest desktop applications, found by default on every window system. Over time these have been extended to support scientific and programmer modes, but fundamentally they all work the same.

In this short write-up we implement a working standard desktop calculator using PyQt. This implementation ...

.pyMadeThis 2018-04-16 06:00:00

No2Pads — Basic Notepad editor in Python, using PyQt

Notepad doesn't need much introduction. It's a plaintext editor that's been part of Windows since the beginning, and similar applications exist in every GUI desktop ever created.

Here we reimplement Notepad in Python using PyQt, a task that is made particularly easy by Qt providing a text ...

python – Dr. Randal S. Olson 2018-04-12 00:08:59

Traveling salesman portrait in Python

Last week, Antonio S. Chinchón made an interesting post showing how to create a traveling salesman portrait in R. Essentially, the idea is to sample a bunch of dark pixels in an image, solve the well-known traveling salesman problem for