Planet SciPy

neptune.ai 2021-02-26 07:36:00

Adaptive Mutation in Genetic Algorithm with Python Examples

The genetic algorithm is a popular evolutionary algorithm. It uses Darwin’s theory of natural evolution to solve complex problems in computer science. But, to do so, the algorithm’s parameters need a bit of adjusting. One of the key parameters is mutation. It makes random changes in the chromosomes (i.e. solutions) in order to increase quality […]

The post Adaptive Mutation in Genetic Algorithm with Python Examples appeared first on neptune.ai.

jbencook 2021-02-25 19:35:00

NumPy all: Understanding np.all()

The np.all() function tests whether all elements in a NumPy array evaluate to true.
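
As a minimal sketch of the behaviour described above (the array values here are made up for illustration), np.all() can be applied to a whole array or along one axis:

```python
import numpy as np

# Illustrative data: one element is 0, which evaluates to False.
a = np.array([[1, 2, 3], [4, 0, 6]])

all_truthy = np.all(a)            # tests every element at once
per_column = np.all(a, axis=0)    # one result per column
per_row = np.all(a > 0, axis=1)   # test a condition, one result per row
```

Passing axis reduces along that dimension only, which is handy for checks such as "is every value in this row positive?".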

Anaconda Blog 2021-02-25 19:04:00

How Can Higher Education Better Prepare Students to Enter the Data Science Field?

The skills and experience needed to succeed as a data scientist today will not be the same as those required in five or ten years. The worlds of education and business must collaborate to best prepare students to enter the data science workforce.

neptune.ai 2021-02-25 09:58:00

Where Can You Learn About MLOPS? What Are the Best Books, Articles, or Podcasts to Learn MLOps?

MLOps is not a piece of cake, especially in today’s changing environment. There are many challenges: building, integrating, testing, releasing, deployment, and infrastructure management. You need to follow good practices and know how to adjust to the challenges. And if you don’t learn and develop your knowledge, you’ll fall out of the loop. The right resources […]

Quansight Labs 2021-02-25 08:00:00

Enhancements to Numba's guvectorize decorator

Starting from Numba 0.53, Numba will ship with an enhanced version of the @guvectorize decorator. Similar to the @vectorize decorator, @guvectorize now has two modes of operation:

  • Eager, or decoration-time compilation and
  • Lazy, or call-time compilation

Previously, only the eager approach was supported. In this mode, users must provide a list of concrete supported types up front as the decorator's first argument. Now this list can be omitted, in which case Numba dynamically generates new kernels for previously unsupported types as the function is called.

neptune.ai 2021-02-24 11:30:39

Wasserstein Distance and Textual Similarity

In many machine learning (ML) projects, there comes a point when we have to decide the level of similarity between different objects of interest.  We might be trying to understand the similarity between different images, weather patterns, or probability distributions. With natural language processing (NLP) tasks, we might be checking whether two documents or sentences […]

neptune.ai 2021-02-23 10:38:42

How To Manage a Deep Reinforcement Learning Research Team Part 2: Fractal Nature of Creative Work

In part one, we discussed the importance of focusing on a well-defined project. In this article, we’re diving even deeper, because we’re going to talk about the fractal nature of creative work, or why it’s hard to do meaningful work when your projects are built from sub-projects, which are built from sub-sub-projects, that are built […]

jbencook 2021-02-22 19:23:00

Binary cross entropy explained

A simple NumPy implementation of the binary cross entropy loss function and some intuition about why it works.
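
A rough sketch of what such a NumPy implementation might look like is shown below. The function name, the eps clipping value, and the sample predictions are assumptions made for illustration, not taken from the post:

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross entropy (hypothetical helper, not the post's code)."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip predictions so that log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1 - eps)
    losses = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-np.mean(losses))

# Predictions close to the labels give a small loss, confident mistakes a large one.
good = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
bad = binary_cross_entropy([1, 0, 1], [0.1, 0.9, 0.2])
```

The intuition is that the loss is the negative log-probability the model assigned to the correct label, so it grows quickly as the model becomes confidently wrong.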

While My MCMC Gently Samples 2021-02-23 15:00:00

Introducing PyMC Labs

After I left Quantopian in 2020, something interesting happened: various companies contacted me inquiring about consulting to help them with their PyMC3 models.

Usually, I don't hear how people are using PyMC3 -- they mostly show up on GitHub or Discourse when something isn't working right. So, hearing about all these …

Martin Fitzpatrick - python 2021-02-22 08:00:00

Using MicroPython and uploading libraries on Raspberry Pi Pico — Using rshell to upload custom code

MicroPython is an implementation of the Python 3 programming language, optimized to run on microcontrollers. It's one of the options available for programming your Raspberry Pi Pico and a nice friendly way to get started with microcontrollers.

MicroPython can be installed easily on your Pico, by following the instructions on the …

neptune.ai 2021-02-22 07:50:00

The Best Feature Engineering Tools

When it comes to predictive models, the dataset always needs a good description. In the real world, datasets are raw and need plenty of work. If the model is to understand a dataset for supervised or unsupervised learning, there are several operations you need to perform and this is where feature engineering comes in. In […]

jbencook 2021-02-19 15:25:00

Filtering DataFrames with the .query() method in Pandas

Pandas provides a .query() method on DataFrames with a convenient string syntax for filtering. This post describes the method and gives simple usage examples.

neptune.ai 2021-02-18 19:22:08

PyTorch Lightning vs Ignite: What Are the Differences?

PyTorch is one of the most widely used deep learning libraries, right after Keras. It provides agility, speed and good community support for anyone using deep learning methods in development and research. PyTorch has certain advantages over TensorFlow. As an AI engineer, the two key features I liked a lot are: PyTorch has dynamic graphs […]

neptune.ai 2021-02-18 11:46:34

Keras Tuner: Lessons Learned From Tuning Hyperparameters of a Real-Life Deep Learning Model

The performance of your machine learning model depends on your configuration. Finding an optimal configuration, both for the model and for the training algorithm, is a big challenge for every machine learning engineer. Model configuration can be defined as a set of hyperparameters that influence model architecture. In the case of deep learning, these can be […]

neptune.ai 2021-02-17 17:13:10

Experiment Tracking vs Machine Learning Model Management vs MLOps

It takes quite a lot of steps to take a machine learning model from idea to production. These steps can get too complex, too quickly. In this article, we’ll focus on dissecting the three main aspects of model deployment. These are: experiment tracking, machine learning model management, and MLOps (machine learning operations). At the end of […]

jbencook 2021-02-15 20:54:00

Linear interpolation in Python: An np.interp() example

It's easy to linearly interpolate a 1-dimensional set of points in Python using the np.interp() function from NumPy.
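
A small sketch of np.interp() follows; the sample points are invented for illustration. The function takes the x-coordinates to evaluate, then the known x and y values:

```python
import numpy as np

# Known data points (illustrative values); xp must be increasing.
xp = np.array([0.0, 1.0, 2.0])
fp = np.array([0.0, 10.0, 40.0])

# Interpolate at a single point halfway between x=1 and x=2.
y_mid = np.interp(1.5, xp, fp)

# Interpolate at several points; values outside [0, 2] are clamped
# to the first and last entries of fp by default.
ys = np.interp([-1.0, 0.5, 3.0], xp, fp)
```

The clamping at the ends is worth remembering: np.interp() does not extrapolate unless the left and right arguments are set explicitly.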

neptune.ai 2021-02-15 16:17:50

Best Tools To Do ML Model Serving

Tools for model serving in machine learning can provide you with solutions to many data engineering and DevOps concerns. They have many functionalities that make it easier to manage your models. You can use them during the entire lifecycle of your ML project, beginning with building a trained model, to deploying, monitoring, providing […]

neptune.ai 2021-02-12 09:27:17

What Are The Best Machine Learning Conferences in 2021?

ML conferences are gaining popularity as the field constantly grows and presents new possibilities. Conferences are a great way to exchange the latest research results, learn from experts, and establish professional relationships that can help grow your business. Attending conferences is one of the best ways to learn and develop skills. Additionally, you can meet […]

NumFOCUS 2021-02-10 19:54:10

Job Posting | Events and Digital Marketing Coordinator

Job Title: Events and Digital Marketing Coordinator. Position Overview: The primary role of the Events and Digital Marketing Coordinator is to support and assist the Events Manager and the Community Communications and Marketing Manager to advance one of NumFOCUS’s primary missions of educating and building the community of users and developers of open source scientific […]

The post Job Posting | Events and Digital Marketing Coordinator appeared first on NumFOCUS.

jbencook 2021-02-09 12:00:00

NumPy meshgrid: Understanding np.meshgrid()

You can create multi-dimensional coordinate arrays using the np.meshgrid() function, which is also available in PyTorch and TensorFlow. But watch out! PyTorch uses different indexing by default so the results might not be the same.
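
The indexing difference mentioned above can be sketched with NumPy alone. This example only uses np.meshgrid(); the claim that matrix-style "ij" indexing matches the other library's default should be treated as an assumption to verify against that library's documentation:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

# NumPy's default is Cartesian indexing ("xy"): outputs have shape (len(y), len(x)).
xx, yy = np.meshgrid(x, y)

# Matrix indexing ("ij"): outputs have shape (len(x), len(y)).
ii, jj = np.meshgrid(x, y, indexing="ij")

# The two conventions are transposes of each other in the 2-D case.
same_after_transpose = np.array_equal(xx, ii.T) and np.array_equal(yy, jj.T)
```

Checking the output shapes is usually the quickest way to notice that two libraries disagree on the indexing convention.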

jbencook 2021-02-08 20:23:00

SageMaker Studio quick start

A step-by-step quick start guide for SageMaker Studio. Start a Studio session, launch a notebook on a GPU instance and run object detection inference with a detectron2 pre-trained model.

jbencook 2021-02-02 22:47:00

PyTorch one hot encoding

PyTorch has a one_hot() function for converting class indices to one-hot encoded targets.

Living in an Ivory Basement 2021-02-01 23:00:00

Transition your Python project to use pyproject.toml and setup.cfg! (An example.)

Updating old Python packages, in this year of the PSF 2021!

jbencook 2021-01-30 12:55:00

Numpy pad: Understanding np.pad()

The np.pad() function has a complex, powerful API. But basic usage is very simple and complex usage is achievable! This post shows you how to use NumPy pad and gives a couple of examples.
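
To give a feel for the simple and the more involved usage, here is a short sketch with made-up arrays. It shows a scalar pad width, a (before, after) pair, a non-default mode, and per-axis widths:

```python
import numpy as np

a = np.array([1, 2, 3])

# Basic usage: pad 2 zeros on each side (mode="constant" is the default).
zeros = np.pad(a, 2)

# Asymmetric padding with a custom fill value: 1 before, 3 after.
nines = np.pad(a, (1, 3), constant_values=9)

# Repeat the edge values instead of using a constant.
edges = np.pad(a, 2, mode="edge")

# For 2-D input, give one (before, after) pair per axis.
m = np.pad(np.ones((2, 2), dtype=int), ((1, 0), (0, 2)))
```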

jbencook 2021-01-29 21:00:00

Iterating over rows in Pandas

When you absolutely have to iterate over rows in a Pandas DataFrame, use the .itertuples() method.
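
A minimal sketch of the recommended method, using a made-up DataFrame. Each row comes back as a namedtuple, so columns are available as attributes:

```python
import pandas as pd

# Illustrative data; the column names are assumptions for this example.
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

labels = []
# index=False leaves the index out of each tuple.
for row in df.itertuples(index=False):
    labels.append(f"{row.name}={row.value}")
```

Vectorized operations are usually still preferable; this pattern is for the cases where a Python-level loop really is needed.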

Martin Fitzpatrick - python 2021-01-28 14:00:00

SAM Coupé SCREEN$ Converter — Interrupt optimizing image converter

The SAM Coupé was a British 8-bit home computer that was pitched as a successor to the ZX Spectrum, featuring improved graphics and sound and higher processor speed.

The SAM Coupé's high-color MODE4 could manage 256x192 resolution graphics, with 16 colors from a choice of 128. Each pixel …

Anaconda Blog 2021-01-26 18:30:00

New Year’s resolutions for data scientists in 2021

The start of a new year is a popular time to recalibrate habits and set goals for improvement, both personally and professionally. First, it’s helpful to take stock of where things stand, in order to home in on potential areas for betterment. In data science, the past year represented another step forward in the maturation of the discipline, especially with the onset of the COVID-19 pandemic. We saw researchers come together to harness the power of data and open-source software for public health, and concepts of statistical modeling become mainstream as we looked to curb the disease’s spread. At the same time, we had a glimpse into a future of division if we don’t pay heed to issues of bias in data and explainability in algorithms.

Living in an Ivory Basement 2021-01-24 23:00:00

A snakemake hack for checkpoints

snakemake checkpoints r awesome

Quansight Labs 2021-01-24 04:00:00

Python packaging in 2021 - pain points and bright spots

At Quansight we have a weekly "Q-share" session on Fridays where everyone can share/demo things they have worked on, recently learned, or that simply seem interesting to share with their colleagues. This can be about anything, from new utilities to low-level performance, from building inclusive communities to how to write better documentation, from UX design to what legal & accounting does to support the business. This week I decided to try something different: hold a brainstorm on the state of Python packaging today.

The ~30 participants were mostly from the PyData world, but not exclusively - it included people with backgrounds and preferences ranging from C, C++ and Fortran to JavaScript, R and DevOps - and with experience as end-users, packagers, library authors, and educators. This blog post contains the raw output of the 30-minute brainstorm (only cleaned up for textual issues) and my annotations on it (in italics) which capture some of the discussion during the session and links and context that may be helpful. I think it sketches a decent picture of

(continued...)

jbencook 2021-01-23 14:49:00

The PyTorch softmax() function with example usage

You can use the top-level torch.softmax() function from PyTorch for your softmax activation needs.

jbencook 2021-01-22 14:56:00

Installing packages in a Jupyter notebook

This post describes a trick for installing/upgrading Python packages in a Jupyter notebook. It's useful for scratch code, but don't do this when you need reproducible code.

Martin Fitzpatrick - python 2021-01-22 14:00:00

SAM Coupé Reader — Preserving FRED retro disk magazine text, by decoding the Entropy Reader

FRED was the most popular disk magazine for the SAM Coupé 8-bit home computer. Published by Colin MacDonald out of sunny Monifieth, Scotland, the magazine ran from its first issue in 1990 through to its last (82) in 1998.

For the SAM networking project I was hoping there might …

Quansight Labs 2021-01-22 14:00:00

Making SciPy's Image Interpolation Consistent and Well Documented

SciPy n-dimensional Image Processing

SciPy's ndimage module provides a powerful set of general, n-dimensional image processing operations, categorized into areas such as filtering, interpolation and morphology. Traditional image processing deals with 2D arrays of pixels, possibly with an additional array dimension of size 3 or 4 to represent color channel and transparency information. However, there are many scientific applications where we may want to work with more general arrays such as the 3D volumetric images produced by medical imaging methods like computed tomography (CT) or magnetic resonance imaging (MRI) or biological imaging approaches such as light sheet microscopy. Aside from spatial axes, such data may have additional axes representing other quantities such as time, color, spectral frequency or different contrasts. Functions in ndimage have been implemented in a general n-dimensional manner so that they can be applied across 2D, 3D or more dimensions. A more detailed overview of the module is available in the SciPy ndimage tutorial. SciPy's image functions are

(continued...)

jbencook 2021-01-21 16:15:00

The quick start guide to plotting histograms in Seaborn

The histplot() function in Seaborn is a great API for plotting histograms to visualize the distribution of your Pandas columns.

Martin Fitzpatrick - python 2021-01-21 07:00:00

micro:bit Space Invaders — MicroPython retro game in just 25 pixels

How much game can you fit into 25 pixels? Quite a bit it turns out.

This is a mini clone of arcade classic Space Invaders for the BBC micro:bit microcomputer. Using the accelerometer and two buttons for input, you can beat off wave after wave of aliens that advance …

jbencook 2021-01-15 11:00:00

Normalizing Images in PyTorch

You can use the torchvision Normalize() transform to subtract the mean and divide by the standard deviation for image tensors in PyTorch. But it's important to understand how the transform works and how to reverse it.

jbencook 2021-01-14 06:00:00

Dropping columns and rows in Pandas

There are a few ways to drop columns and rows in Pandas. This post describes the easiest way to do it and provides a few alternatives that can sometimes be useful.
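
As a hedged illustration (the post does not say here which approach it considers easiest), the DataFrame.drop() method covers both columns and rows. The data below is invented:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Drop a column by name; the original DataFrame is left unchanged.
without_b = df.drop(columns=["b"])

# Drop a row by index label.
without_first = df.drop(index=[0])

# An alternative for columns: select only the ones to keep.
kept = df[["a", "c"]]
```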

jbencook 2021-01-12 06:00:00

The easiest way to rename a column in Pandas

Two easy recipes for renaming column(s) in a Pandas DataFrame.
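
The excerpt does not name the two recipes, so the sketch below simply shows two common ways to rename columns, on an invented DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Option 1: rename selected columns with a mapping; returns a new DataFrame.
renamed = df.rename(columns={"a": "alpha"})

# Option 2: replace all column labels at once on a copy.
relabeled = df.copy()
relabeled.columns = ["x", "y"]
```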

jbencook 2021-01-11 06:00:00

Reshaping arrays: How the NumPy reshape operation works

This post explains how the NumPy reshape operation works, how to use it and gotchas to watch out for.

Anaconda Blog 2021-01-08 16:35:00

What’s to come in 2021: 5 predictions for the future of data science and AI/ML

Since our founding in 2012, we’ve set out to create a movement that brings together data science practitioners, enterprises, and the open-source community. Data science has gone from a “nice-to-have” to a requisite for most businesses, and we’re proud to have witnessed its growth and expansion in recent years. But we also know there’s still much more to come.

jbencook 2021-01-08 06:00:00

NumPy norm: Understanding np.linalg.norm()

You can calculate the L1 and L2 norms of a vector or the Frobenius norm of a matrix in NumPy with np.linalg.norm(). This post explains the API and gives a few concrete usage examples.
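
The three norms mentioned above can be sketched as follows, with values chosen so the results are easy to check by hand:

```python
import numpy as np

v = np.array([3.0, -4.0])

l2 = np.linalg.norm(v)          # default for vectors: sqrt(3^2 + 4^2)
l1 = np.linalg.norm(v, ord=1)   # sum of absolute values: 3 + 4

A = np.array([[1.0, 2.0], [2.0, 4.0]])
fro = np.linalg.norm(A)         # default for matrices: Frobenius norm
```

For a matrix, the Frobenius norm is the square root of the sum of squared entries, here sqrt(1 + 4 + 4 + 16).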

ListenData 2021-01-06 10:35:00

Run SAS in Python without Installation

Introduction
In the past few years, Python has gained huge popularity as a programming language in the data science world. Many banks and pharma organisations have started using Python, and some of them are in a transition stage, migrating their SAS syntax library to Python. Many big organisations have been using SAS since the early 2000s and have developed hundreds of SAS programs for tasks ranging from data extraction to model building and validation. Hence it's a marathon task to migrate SAS code to any other programming language. Migration can only be done in phases so that day-to-day tasks are not hit by the development and testing of Python code. Since Python is open source, maintaining the existing code can sometimes become difficult. Some SAS procedures are very robust and powerful, and their alternatives in Python are not yet implemented; they might be doable, but not in a straightforward way for the average developer or analyst.

Do you wish

(continued...)

Quansight Labs 2021-01-04 08:00:00

Welcoming Tania Allard as Quansight Labs co-director

Today I'm incredibly excited to welcome Tania Allard to Quansight as Co-Director of Quansight Labs. Tania (GitHub, Twitter, personal site) is a well-known and prolific PyData community member. In the past few years she has been involved as a conference organizer (JupyterCon, SciPy, PyJamas, PyCon UK, PyCon LatAm, JuliaCon and more), as a community builder (PyLadies, NumFOCUS, RForwards), as a contributor to Matplotlib and Jupyter, and as a regular speaker and mentor. She also brings relevant experience in both industry and academia - she joins us from Microsoft where she was a senior developer advocate, and has a PhD in computational modelling.

jbencook 2021-01-04 06:00:00

Scientific notation in Python and NumPy

Using and suppressing scientific notation in Python and NumPy.
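
A short sketch of both directions, forcing and suppressing scientific notation. The value is made up, and np.printoptions is used here as one possible way to suppress it in NumPy output:

```python
import numpy as np

x = 1.5e-7

# Plain Python: format specifiers choose the notation explicitly.
sci = f"{x:.2e}"      # scientific notation, 2 decimals
plain = f"{x:.8f}"    # fixed-point notation, 8 decimals

# NumPy switches to scientific notation when values span a wide range.
arr = np.array([x, 1.0])
default_repr = str(arr)

# Temporarily suppress it; the setting is restored when the block exits.
with np.printoptions(suppress=True):
    suppressed_repr = str(arr)
```

Using the context manager rather than np.set_printoptions() keeps the change local, which avoids surprising other code that prints arrays.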

jbencook 2021-01-02 06:00:00

NumPy tile: Understanding np.tile()

The unofficial guide to np.tile() with examples.
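
For a quick impression of what np.tile() does, here is a minimal sketch with an invented array:

```python
import numpy as np

a = np.array([1, 2])

# Repeat the whole array 3 times along its only axis.
row = np.tile(a, 3)

# A tuple of repetitions tiles along several axes: 2 times down, 2 times across.
grid = np.tile(a, (2, 2))
```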

Filipe Saraiva's blog 2020-12-30 12:43:56

House of X/Powers of X

No team of heroes is as dear to me as the X-Men. Around the end of the 90s I started collecting and kept it up for a few years, but then came the fateful price increase with Super-Heróis Premium, which ended up discouraging me from buying. Since then I have followed along sporadically, reading news about them, buying an issue here and there…

Quansight Labs 2020-12-22 09:00:00

Develop a JupyterLab Winter Theme

JupyterLab 3.0 is about to be released and provides many improvements to the extension system. Theming is a way to extend JupyterLab and benefits from those improvements.

While theming is often disregarded as a purely cosmetic endeavour, it can greatly improve software. Theming can be a great help for accessibility, and the Jupyter team pays attention to making the default appearance accessibility-aware by using sufficient contrast. Users with high visual acuity may also choose to increase the information density.

Theming can also be a great way to improve communication by increasing or decreasing emphasis of the user interface, which can be of use for teaching or presenting. Theming may also help with security, for example, by having a clear distinction between staging and production.

Finally, theming can be a great way to express oneself, for example, by using a branded version of software that fits well into a context, or expressing one's artistic preferences or opinions.

In the following blog post, we will show you step-by-step how you

(continued...)

ListenData 2020-12-21 14:50:00

Wish Christmas with Python and R

This post is dedicated to all the Python and R programming lovers. Flaunt your knowledge in your peer group with the following programs. As a data science professional, you want your wish to be special on Christmas Eve. If you study the code, you may also learn a trick or two that you can use later in your daily tasks.

Method 1 : Run the following program and see what I mean

R Code


paste(intToUtf8(acos(log(1))*180/pi-13),
toupper(substr(month.name[2],2,2)),
paste(rep(intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2),2), collapse = intToUtf8(0)),
LETTERS[5^(3-1)], intToUtf8(atan(1/sqrt(3))*180/pi+2),
toupper(substr(month.abb[10],2,2)),
intToUtf8(acos(log(1))*180/pi-(2*3^2)),
toupper(substr(month.name[4],3,4)),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2+1),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+2*4),
intToUtf8(acos(log(1))*180/pi-13),
LETTERS[median(0:2)],
intToUtf8(atan(1/sqrt(3))*180/pi*3-7),
sep = intToUtf8(0)
)

Python Code


import math
import datetime

(chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ datetime.date(1900, 2, 1).strftime('%B')[1] \
+ 2 * datetime.date(1900, 2, 1).strftime('%B')[3] \
+ datetime.date(1900, 2, 1).strftime('%B')[7] \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi+2)) \
+ datetime.date(1900, 10, 1).strftime('%B')[1] \
+ chr(int(math.acos(math.log(1))*180/math.pi-18)) \
+ datetime.date(1900, 4, 1).strftime('%B')[2:4] \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+3*2+1)) \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+2*4)) \
+ chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ "{:c}".format(97) \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi*3-7))).upper()

Method 2 : Audio Wish for Christmas

Turn on computer speakers before running the code.

R Code



install.packages("audio")
library(audio)
christmas_file <- tempfile()
download.file("https://sites.google.com/site/pocketecoworld/merrychristmas1.wav", christmas_file, mode = "wb")
xmas
(continued...)

fa.bianp.net 2020-12-20 23:00:00

On the Link Between Optimization and Polynomials, Part 2

An analysis of momentum can be tightened using a combination of Chebyshev polynomials of the first and second kind. Through this connection we'll derive one of the most iconic methods in optimization: Polyak momentum.

ListenData 2020-12-19 15:59:00

How to use variable in a query in pandas

Suppose you want to reference a variable in a query in the pandas package in Python. This seems like a straightforward task, but it can sometimes become daunting. Let's discuss it with examples in the article below.

Let's create a sample dataframe with 3 columns and 4 rows. This dataframe is used for demonstration purposes.


import pandas as pd
df = pd.DataFrame({"col1" : range(1,5),
"col2" : ['A A','B B','A A','B B'],
"col3" : ['A A','A A','B B','B B']
})

Filter a value A A in column col2

To reference a variable in a query, you need to prefix it with @.
Mention
(continued...)
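
Since the excerpt is cut off before its example, here is a minimal sketch of the @ syntax using the sample dataframe defined above. The variable name target is an assumption chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({"col1": range(1, 5),
                   "col2": ["A A", "B B", "A A", "B B"],
                   "col3": ["A A", "A A", "B B", "B B"]})

# A local Python variable holding the value to filter on (hypothetical name).
target = "A A"

# Inside the query string, @target refers to the Python variable,
# while col2 refers to the DataFrame column.
filtered = df.query("col2 == @target")
```

Without the @ prefix, pandas would look for a column called target instead of the local variable.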

NumFOCUS 2020-12-18 21:21:54

NumFOCUS hires Open Source Developer Advocate!

  NumFOCUS is pleased to announce that Arliss Collins has been hired as our organization’s first Open Source Developer Advocate. Founded in 2012, NumFOCUS has finally grown beyond just providing non-technical needs for our 40+ sponsored projects! As our first technical hire, Arliss will work to help understand our projects from a technical perspective and […]

NumFOCUS 2020-12-11 19:37:25

A Pivotal Time in NumFOCUS’s Project Aimed DEI Efforts

NumFOCUS is pleased to announce the launch of our Contributor Diversification & Retention Research Project funded by a grant from the Gordon and Betty Moore Foundation.  “We were eager to support NumFOCUS’s diversity initiative because it aims to get to the heart of what is preventing greater participation in data science. We are hopeful that […]

Anaconda Blog 2020-12-10 17:21:00

Data literacy is for everyone - not just data scientists

Today more than ever before, businesses around the world are embracing data and data-driven decisions as key aspects of operating a modern, successful organization. Yet, despite the surface-level recognition of the importance of data and the insights it can provide, the process of harnessing data and mining those insights is often perceived as abstract and mysterious, a domain that solely belongs to experts with PhDs.

Anaconda Blog 2020-12-01 21:30:00

Six must-have soft skills for every data scientist

Today, it’s hard to imagine a world without data science. Over the last few decades, it has become ingrained in society. Particularly during the COVID-19 pandemic, data is front and center in headlines every day. That being said, it’s important to remember that data science is still a new field, and one not without its challenges to overcome.

NumFOCUS 2020-11-23 14:44:42

Anaconda Announces Multi-Year Partnership with NumFOCUS

A key stakeholder in the open source scientific computing ecosystem has further formalized their long-standing partnership with NumFOCUS. Anaconda, the Austin, Texas-based software development and consulting company which provides global distribution of Python and R software packages, last month introduced their Anaconda Dividend Program. Through this initiative, Anaconda plans to direct a portion of their […]

Anaconda Blog 2020-11-20 19:56:00

Anaconda Individual Edition 2020.11

There were also quite a few other bug fixes and improvements. To see the full list, please visit the Anaconda Navigator 1.10.0 release notes.

Quansight Labs 2020-11-19 17:29:55

A second CZI grant for NumPy and OpenBLAS

I am happy to announce that NumPy and OpenBLAS have once again been awarded a grant from the Chan Zuckerberg Initiative through Cycle 3 of the Essential Open Source Software for Science (EOSS) program. This new grant totaling $140,000 will fund part of our efforts to improve usability and sustainability in both projects and is excellent news for the scientific computing community, which will certainly benefit from this work downstream.

NumFOCUS 2020-11-18 18:36:55

NumFOCUS Receives Support from Heising-Simons

NumFOCUS is grateful to announce that we received a grant award of $50,000 in October from the Heising-Simons Foundation. This generous grant funding will provide general support resources to NumFOCUS and will benefit all of our Sponsored and Affiliated Projects as well as our organization’s several programs and initiatives. “This grant award from Heising-Simons will […]

Quansight Labs 2020-11-18 05:00:30

Introduction to Design in Open Source

This blog post is a conversation. Portions led by Tim George are marked with TG, and those led by Isabela Presedo-Floyd are marked with IPF.

TG: When I speak with other designers, one common theme I see concerning why they chose this career path is they want to make a difference in the world. We design because we imagine a better world and we want to help make it real. Part of the reason we design as a career is we're unable to go through life without designing; we're always thinking about how things are and how they could be better. This ethos also exists in many open-source communities. It seems like it ought to be an ideal match.

So what's the disconnect? I'm still exploring that myself, but after a few years in open source I want to share my observations, experiences, and hope for a stronger collaboration between design and development. I don't think I have a complete solution, and some days I'm not even sure I grasp the entire

(continued...)

Anaconda Blog 2020-11-13 16:08:00

Behind the Code of Dask and pandas: Q&A with Tom Augspurger

Data science and related fields have been born in and pushed forward by open-source projects. Open-source communities allow for people to work together to solve larger problems. As stewards of the data science community, we believe it is important to go behind the lines of code to shine a light on those doing the work in open source. In a series of blogs, we’ll highlight several Anaconda employees, the open-source projects they work on, and how their work is making an impact on the larger field.

Quansight Labs 2020-11-13 06:00:00

Querying multiple backends with Ibis

In our recent Ibis post, we discussed querying & retrieving data using a familiar Pandas-like interface. That discussion focused on the fluent API that Ibis provides to query structure from a SQLite database—in particular, using a single specific backend. In this post, we'll explore Ibis's ability to answer questions about data using two different Ibis backends.

import ibis.omniscidb, dask, intake, sqlalchemy, pandas, pyarrow as arrow, altair, h5py as hdf5

Ibis in the scientific Python ecosystem

Before we delve into the technical details of using Ibis, we'll consider Ibis in the greater historical context of the scientific Python ecosystem. It was started by Wes McKinney, the creator of Pandas, as a way to query information on the Hadoop distributed file system and PySpark. More backends were added later as Ibis became a general tool for data queries.

Throughout the rest of this post, we'll highlight the ability of Ibis to generically prescribe query expressions across different data storage systems.

(continued...)

Quansight Labs 2020-11-12 19:00:06

Manylinux1 is obsolete, manylinux2010 is almost EOL, what is next?

The basic installation format for users who install packages via pip is the wheel format. Wheel names are composed of four parts: a package-name-and-version tag (which can be further broken down), a Python tag, an ABI tag, and a platform tag. More information on the tags can be found in PEP 425. So a package like NumPy will be available on PyPI as numpy-1.19.2-cp36-cp36m-win_amd64.whl for 64-bit Windows and numpy-1.19.2-cp36-cp36m-macosx_10_9_x86_64.whl for macOS. Note that only the platform tag (win_amd64 or macosx_10_9_x86_64) differs.
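
To make the naming scheme concrete, here is a small sketch that splits the two wheel names from the paragraph above into their tags. The helper split_wheel_name is hypothetical and written only for this illustration; it is not part of pip or any packaging library, and it ignores the optional build tag:

```python
def split_wheel_name(filename):
    """Split a wheel filename into its tags (hypothetical helper, simplified)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: " + filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected number of fields: " + filename)
    # The last three fields are the Python, ABI, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }

win = split_wheel_name("numpy-1.19.2-cp36-cp36m-win_amd64.whl")
mac = split_wheel_name("numpy-1.19.2-cp36-cp36m-macosx_10_9_x86_64.whl")

# Collect the fields in which the two wheels differ.
differing = [key for key in win if win[key] != mac[key]]
```

Comparing the two results confirms the point made in the text: everything except the platform field is identical.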

But what about Linux? There is no single, vendor-controlled "Linux platform": Ubuntu, RedHat, Fedora, Debian, and FreeBSD, for example, all package software at slightly different versions. What most Linux distributions do have in common is the glibc runtime library, and a smattering of various additional system libraries. So it is possible to define a least common denominator (LCD) of software expected to be on a Linux platform (exceptions apply, e.g. non-glibc distributions).

The decision to converge on an LCD common platform gave birth to the manylinux1 standard. Going back to our example, numpy

(continued...)

Anaconda Blog 2020-11-10 16:30:00

Behind the Code of HoloViz: Q&A with Sr. Software Engineer Philipp Rudiger

Data science and related fields have been born in and pushed forward by open-source projects. Open-source communities allow for people to work together to solve larger problems. As stewards of the data science community, we believe it is important to go behind the lines of code to shine a light on those doing the work in open source. In a series of blogs, we’ll highlight several Anaconda employees, the open-source projects they work on, and how their work is making an impact on the larger field.

Filipe Saraiva's blog 2020-11-05 14:50:03

A chat with Vivi Reis about technology and politics

Tonight (November 5) at 8 pm I will be talking with Vivi Reis, PSOL candidate for city council in Belém. In the chat we will focus mostly on topics where technology and politics intertwine. Among the points, we will cover the Data Office, data and public policy, free software in public administration, connectivity in Belém, digital inclusion, citizen apps,… Continue reading »A chat with Vivi Reis about technology and politics
Spyder Blog 2020-11-05 00:00:00

New features in Spyder 4's new debugger!

IPython is a great improvement over the standard Python interpreter, bringing many enhancements such as autocompletion and "magic" commands. When debugging, however, many of these features become inaccessible. With Spyder, we aim to bring back these capabilities and more for a truly premium debugging experience! (And believe me, I use this debugger a lot, and not only because I write code that might contain bugs :p).

In this post, I will describe the debugger improvements we've already made in Spyder 4, as well as those that are already implemented or under review for Spyder 4.2 and beyond.

Make the debugger more like IPython

IPython improves on the stock Python interpreter by adding syntax highlighting, completion, and history. We have done the same for the debugger!

The output is now prettier (and easier to read) than the plain black text of Spyder 3!

Code completion and history for the debugger use the same functionality as the IPython console, so you should not notice any difference in behaviour. Just press

(continued...)
NumFOCUS 2020-11-04 00:10:51

JupyterCon 2020: Code of Conduct Reports

Following the reports to the NumFOCUS Code-of-Conduct committee on Jeremy Howard’s keynote at JupyterCon 2020, and the controversy that followed, the NumFOCUS Code of Conduct Committee issued a public apology to Jeremy Howard and escalated the case to the board of directors. The context: In his keynote at JupyterCon 2020, Jeremy Howard gave a point-by-point rebuttal of […]

The post JupyterCon 2020: Code of Conduct Reports appeared first on NumFOCUS.

NumFOCUS 2020-10-30 18:51:02

Public Apology to Jeremy Howard

We, the NumFOCUS Code of Conduct Enforcement Committee, issue a public apology to Jeremy Howard for our handling of the JupyterCon 2020 reports. We should have done better. We thank you for sharing your experience and we will use it to improve our policies going forward. We acknowledge that it was an extremely stressful experience, […]

The post Public Apology to Jeremy Howard appeared first on NumFOCUS.

Paul Ivanov’s Journal 2020-10-29 07:00:00

Money and California Propositions (2020)

Ten years ago, I made some plots for how much money was contributed to and spent by the various proposition campaigns in California.

I decided to update these for this election, and here's the result:

Just in case you didn't get the full picture, here is the same data plotted on a common scale:

So, whereas 10 years ago we had a total of ~$58 million spent on the election, the overwhelming amount of it in support, this time we had ~$662 million, an 11-fold increase!

The Cal-Access Campaign Finance Activity: Propositions & Ballot Measures source I used last time was still there, but there are way more propositions this time (12 vs 5), and the money details are broken out by committee, with some propositions having a dozen committees. Another wrinkle is that the website is now protected by some fancy scraping protection. I could browse it just fine in Firefox, even with JavaScript turned off, but couldn't download it using wget, curl,

(continued...)
Anaconda Blog 2020-10-28 14:00:00

Sustaining the open-source DS/ML ecosystem with the Anaconda Dividend Program

Back in April, I shared a blog post about changes to our Terms of Service. In that post, I mentioned that, as we moved to a paid license model for our commercial users, we would plan to invest a portion of those profits back into the broader open-source community. Since that time, we have formalized our plan, and I’d like to share it with you today.
Anaconda Blog 2020-10-28 14:00:00

Anaconda Commercial Edition FAQ

We built our open-source product, Anaconda Individual Edition (aka Anaconda Distribution), with the intention of supporting individual open-source practitioners and researchers. Back in 2012, we didn’t know that what we were building would end up being used by tens of thousands of commercial organizations as machine learning and AI became increasingly critical to competitive advantage.
NumFOCUS 2020-10-26 18:13:17

TARDIS Joins NumFOCUS as a Sponsored Project

NumFOCUS is pleased to announce the newest addition to our fiscally sponsored projects: TARDIS. TARDIS is an open-source, Monte Carlo based radiation transport simulator for supernovae ejecta. TARDIS simulates photons traveling through the outer layers of an exploded star including relevant physics like atomic interactions between the photons and the expanding gas. The TARDIS collaboration […]

The post TARDIS Joins NumFOCUS as a Sponsored Project appeared first on NumFOCUS.

Filipe Saraiva's blog 2020-10-26 13:51:04

For a Data Office for Public Policy in Belém

Data have always been decisive for the design and implementation of public policies at every level of government. Tracking indicators on the economy, health, violence, urban travel, the spatial distribution of the population, and the coverage areas of leisure venues, among others, are just some of the data that can inform the design of policies… Continue reading »For a Data Office for Public Policy in Belém
ListenData 2020-10-23 16:03:00

Translating Web Page while Scraping

Suppose you need to scrape data from a website after translating the web page, in R or Python. Google Chrome has an option to translate pages written in a foreign language. If you are an English speaker who does not know any other language, and you want to extract data from a website that offers no option to switch to English, this article shows you how to translate a web page while scraping it.
What is Selenium?
You may not be familiar with Selenium, so it is important to understand the background. Selenium is an open-source tool for automating web browsers that is very popular in the testing domain. It allows you to write test scripts in several programming languages. Selenium is available in both R and Python.
Translate Page in Web Scraping in R and Python
In R there is a package named RSelenium, whereas in Python Selenium can be installed via the selenium package.
(continued...)
NumFOCUS 2020-10-23 15:25:08

NumFOCUS Earns Transparency Recognition from GuideStar

Earlier this week, NumFOCUS earned our first-ever Silver Seal of Transparency from GuideStar, an independent organization which classifies nonprofit organizations based on multiple metrics pertaining to transparency and accountability. Fewer than 5% of US-based nonprofits have received this type of recognition. “This respected acknowledgment comes as we prepare to enter our year-end fundraising season,” said […]

The post NumFOCUS Earns Transparency Recognition from GuideStar appeared first on NumFOCUS.

ListenData 2020-10-11 14:45:00

Learn Python for Data Science

This tutorial will help you learn Data Science with Python through examples. It is designed for beginners who want to get started with Data Science in Python. Python is an open source language that is widely used as a high-level language for general-purpose programming. It has become highly popular in the data science world. In the PyPL Popularity of Programming Language index, Python ranked second with a 14 percent share. In the advanced analytics and predictive analytics market, it is ranked among the top 3 programming languages.
Data Science with Python Tutorial
Introduction
Python is widely used and very popular for a variety of software engineering tasks such as website development, cloud architecture, back-end development, etc. It is equally popular in the data science world. In the advanced analytics world, there have been several debates on R vs. Python. There are some areas, such as the number of libraries for statistical analysis, where R wins over Python, but Python is catching up
(continued...)
Paul Ivanov’s Journal 2020-10-08 07:00:00

aka: also known as

I was chatting with Anthony Scopatz last week, and one of the things we covered was how it'd be cool to have a subcommand launcher, kind of like git, where the subcommands were swappable. If you're not familiar, git automatically calls out to git-something (note the dash) whenever you run

$ git something

and something is not one of the builtin git commands. For me, ~/bin is in my PATH, so

$ git lost
git: 'lost' is not a git command. See 'git --help'.
$ echo "echo how rude!" > ~/bin/git-lost; chmod +x ~/bin/git-lost
$ git lost
how rude!
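The fallback lookup shown above can be sketched in a few lines of Python. This is only an illustration of the "look for `<tool>-<subcommand>` on the PATH" rule, not how git itself is implemented; the function name `resolve_external` is hypothetical, and the demo assumes a POSIX system where the executable bit makes a script runnable.

```python
import os
import shutil
import tempfile
from typing import Optional


def resolve_external(tool: str, subcommand: str, path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable named '<tool>-<subcommand>', or None."""
    # shutil.which searches the given search path (or $PATH when path is None).
    return shutil.which(f"{tool}-{subcommand}", path=path)


# Demo in a throwaway directory that stands in for ~/bin.
with tempfile.TemporaryDirectory() as bin_dir:
    script = os.path.join(bin_dir, "git-lost")
    with open(script, "w") as fh:
        fh.write("#!/bin/sh\necho how rude!\n")
    os.chmod(script, 0o755)  # equivalent of chmod +x

    found = resolve_external("git", "lost", path=bin_dir)
    missing = resolve_external("git", "gone", path=bin_dir)
    found_name = os.path.basename(found) if found else None

print(found_name)  # git-lost
print(missing)     # None
```

Passing the search path explicitly keeps the demo independent of whatever happens to be installed on the machine running it.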

And so what Anthony was talking about was having two commands that are supposed to do the same thing, and being able to switch between them. For example: maybe we have git-away and git-gone and both of them perform a similar function, and we wish to call our preferred one when we run git lost.

One way to do this would be to copy or symlink our chosen version as git-lost, and replace that file whenever

(continued...)
Quansight Labs 2020-09-29 16:00:00

Design of the Versioned HDF5 Library

In a previous post, we introduced the Versioned HDF5 library and described some of its features. In this post, we'll go into detail on how the underlying design of the library works on a technical level.

Read more… (6 min remaining to read)

ListenData 2020-09-20 08:18:00

How to rename columns in Pandas Dataframe

In this tutorial, we will cover various methods to rename columns in a pandas dataframe in Python. Renaming columns is one of the most common data wrangling tasks. If you do not come from a programming background and have worked only in Excel spreadsheets, this may feel less easy in Python, since in MS Excel you can rename a column just by typing the new name in the cell. If you come from a database background, it is similar to ALIAS in SQL. In Python there is a popular data manipulation package called pandas which simplifies these kinds of data operations.
2 Methods to rename columns in Pandas
In Pandas there are two simple methods to rename columns.

The first step is to install the pandas package if it is not already installed. You can check if the package is installed on your machine by running

(continued...)
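The excerpt stops before naming the two methods, so the following is only a sketch of two commonly used ways to rename columns in pandas, which may or may not be the ones the tutorial goes on to cover: `DataFrame.rename` with a mapping, and assigning a new list to `DataFrame.columns`. The sample data and column names are made up for illustration.

```python
import pandas as pd

# Small made-up dataframe for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Option 1: rename selected columns with a mapping.
# rename() returns a new dataframe and leaves df unchanged.
renamed = df.rename(columns={"a": "alpha"})
print(list(renamed.columns))  # ['alpha', 'b']

# Option 2: replace all column names at once by assignment.
# The list must have exactly one name per column.
df.columns = ["x", "y"]
print(list(df.columns))  # ['x', 'y']
```

The mapping form is convenient when only a few columns change, while assignment requires listing every column name in order.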
Filipe Saraiva's blog 2020-08-29 18:48:00

A Seqtember of free virtual events about Qt and KDE

(OK, the pun works better in its English version, "seqtember", than in the original Portuguese "seqtembro", but let's go.) By a great coincidence, the work of fate, or none of the above, we will have a September 2020 packed with high-quality free virtual events about Qt and KDE. Starting from the 4th to the 11th of that month we will have Akademy 2020, the… Continue reading »A Seqtember of free virtual events about Qt and KDE
Neural Ensemble News 2020-08-08 19:27:00

CARLsim5 Released!

Introduction

CARLsim5 is an efficient, easy-to-use, GPU-accelerated library for simulating large-scale spiking neural network (SNN) models with a high degree of biological detail. It allows execution of networks of Izhikevich spiking neurons with realistic synaptic dynamics using multiple off-the-shelf GPUs and x86 CPUs. The simulator provides a PyNN-like programming interface in C/C++, which allows for details and parameters to be specified at the synapse, neuron, and network level.


The present release, CARLsim 5, builds on the efficiency and scalability of earlier releases (Nageswaran et al., 2009; Richert et al., 2011; Beyeler et al., 2015; Chou et al., 2018). The functionality of the simulator has been greatly expanded by the addition of a number of features that enable and simplify the creation, tuning, and simulation of complex networks with spatial structure.


New Features

1. PyNN Compatibility

pyCARL is an interface between the simulator-independent language PyNN and a CARLsim5-based back-end. In other words, you can write the code for an SNN model once, using the

(continued...)
Filipe Saraiva's blog 2020-08-04 23:27:02

What will become of the Lev with the “end” of Saraiva?

Disclaimer: despite my surname, I have no connection whatsoever to Saraiva. Nor do I have answers to the question in the title. As a Lev user I have been following Saraiva's agony with interest. The bookstore chain, one of the largest in Brazil, has for years been caught in a legal imbroglio, owing money to several publishers, in a process that… Continue reading »What will become of the Lev with the “end” of Saraiva?
NumFOCUS 2020-07-31 17:52:20

Dask Life Sciences Fellow [Open Job]

Dask is an open-source library for parallel computing in Python that interoperates with existing Python data science libraries like NumPy, Pandas, Scikit-Learn, and Jupyter. Dask is used today across many different scientific domains. Recently, we’ve observed an increase in use in a few life sciences applications: large-scale imaging in microscopy, single-cell analysis, genomics […]

The post Dask Life Sciences Fellow [Open Job] appeared first on NumFOCUS.

Spyder Blog 2020-07-25 10:00:00

STX Next, Python development company, uses Spyder to improve their workflow

STX Next, one of Europe's largest Python development companies, has shared with us how Spyder has been a powerful tool for them when performing data analysis. It is a pleasure for us on the Spyder team to work every day to improve the workflow of developers, scientists, engineers and data analysts. We are very glad to receive and share an STX Next testimonial about Spyder, along with an interview with one of their developers, Michael Wiśniewski, who has found Spyder very useful in his job.

What Michael Wiśniewski says about Spyder

In an era of a continuously growing demand for analysis of vast amounts of data, we are facing increasingly complex tasks to perform. Sure, we are not alone—there are many great tools designed for scientists and data analysts. We have NumPy, SciPy, Matplotlib, Pandas, and others. But, wouldn't it be nice to have one extra tool that could combine all the required packages into one compact working environment? Asking this question is precisely how

(continued...)
NumFOCUS 2020-07-24 16:31:53

NumFOCUS Introduces New Supporter Program

Today NumFOCUS is pleased to introduce a new program for our individual supporters, called Open Science Champions. Each year, our community members generously support NumFOCUS and our Projects in several ways; this program is intended to connect these various forms of support so that we can engage with our community most effectively and offer our […]

The post NumFOCUS Introduces New Supporter Program appeared first on NumFOCUS.

Filipe Saraiva's blog 2020-07-24 14:49:05

Educação Vigiada

This pandemic period has been productive on many fronts, which unfortunately means less time to write about that work here on the blog. In this post I want to make up for that lapse by talking about one of the most important projects I have contributed to recently, Educação Vigiada. A few months ago the project Educação… Continue reading »Educação Vigiada
NumFOCUS 2020-07-14 20:36:34

Open Source Developer Advocate

Position Overview: The primary role of the Open Source Developer Advocate is to represent and support developers of NumFOCUS open source projects by serving as a link to internal and external stakeholders as well as the global user community. You will generate attention and support by applying your technical knowledge, passion for open source data […]

The post Open Source Developer Advocate appeared first on NumFOCUS.

Filipe Saraiva's blog 2020-07-10 23:09:48

Engrenagem Ep. 04 – Favorite KDE applications of Brazilian KDErs

This Saturday, July 11, at 10 am, KDE Brasil will be back with episodes of Engrenagem, the Brazilian community's videocast (which has gone 4 years without new episodes 🙂 ). To get things going again, the episode will feature 6 Brazilian contributors (Ângela, Aracele, Caio, Filipe (me), Fred and Tomaz) talking about their favorite KDE applications –… Continue reading »Engrenagem Ep. 04 – Favorite KDE applications of Brazilian KDErs
Spyder Blog 2020-07-08 10:00:00

Writing docs is not just writing docs

This blogpost was originally published on the Quansight Labs website.

I joined the Spyder team almost two years ago, and I never thought I was going to end up working on docs. Six months ago I started a project with CAM Gerlach and Carlos Cordoba to improve Spyder’s documentation. At first, I didn’t actually understand how important docs are for software, especially for open source projects. However, during all this time I’ve learned how documentation has a huge impact on the open-source community and I’ve been thankful to have been able to do this. But, from the beginning, I asked myself “why am I the ‘right person’ for this?”

Improving Spyder’s documentation started as part of a NumFOCUS Small Development Grant awarded at the end of last year. The goal of the project was not only to update the documentation for Spyder 4, but also to make it more user-friendly, so users can understand Spyder’s key concepts and get started with it more

(continued...)
Filipe Saraiva's blog 2020-06-25 13:15:22

On the book “Uma História de Desigualdade”

I have finished reading Pedro de Souza's award-winning book, “Uma História de Desigualdade – A Concentração de Renda entre os Ricos no Brasil 1926 – 2013“, based on the thesis he defended in the sociology program at UnB. It is a substantial book that lives up to all the praise it has received since the… Continue reading »On the book “Uma História de Desigualdade”
Spyder Blog 2020-06-12 18:00:00

Thanking the people behind Spyder 4

This blogpost was originally published on the Quansight Labs website.

After more than three years in development and more than 5000 commits from 60 authors around the world, Spyder 4 finally saw the light on December 5, 2019! I decided to wait until now to write a blogpost about it because shortly after the initial release, we found several critical performance issues and some regressions with respect to Spyder 3, most of which are fixed now in version 4.1.3, released on May 8th 2020.

This new release comes with a lengthy list of user-requested features aimed at providing an enhanced development experience at the level of top general-purpose editors and IDEs, while strengthening Spyder's specialized focus on scientific programming in Python. The interested reader can take a look at some of them in previous blog posts, and in detail in our Changelog. However, this post is not meant to describe those improvements, but to acknowledge all people that contributed

(continued...)
Gaël Varoquaux - programming 2020-05-27 22:00:00

Technical discussions are hard; a few tips

Note

This post discusses the difficulties of communicating while developing open-source projects and tries to give some simple advice.

A large software project is above all a social exercise in which technical experts try to reach good decisions together, for instance on GitHub pull requests. But communication is difficult, in …

Paul Ivanov’s Journal 2020-05-17 07:00:00

Lazy River of Curious Content 0

This is the first post of what I'm calling a Lazy River of Curious Content. This is a way to review stuff that I've been doing, dealing with, or finding interesting during the week. (This was originally written two weeks ago, on May 3rd; my shoddy internet connectivity kept me from posting it.) I'm loosely following the format that Justin Sherrill uses with great effect over at https://dragonflydigest.com

Learn NixOS by turning a Raspberry Pi into a Wireless Router: Friend of the show Anthony Scopatz tried NixOS for the first time and provides a detailed report:

"While I had read the NixOS pamphlets, and listened politely when the faithful came knocking on my door at inconvenient times, I had never walked the path of functional Linux enlightenment myself"

Reading through that made me file away a todo of writing up how I use propellor (and why). But those todos sometimes just pile up for a while...

An interview with one of my long-time nerd-crushes, Rob Pike. The questions focus on the Go programming

(continued...)
Living in an Ivory Basement 2020-05-06 22:00:00

sourmash databases as zip files, in sourmash v3.3.0

Use compressed databases directly!

Filipe Saraiva's blog 2020-05-05 18:29:16

LaKademy 2019

Last November, Latin American KDE contributors landed in Salvador, Brazil to take part in another edition of LaKademy – the Latin American Akademy. That was the seventh edition of the event (or the eighth, if you count Akademy-BR as the first LaKademy) and the second with Salvador as the host city. No problems… Continue reading »LaKademy 2019
Filipe Saraiva's blog 2020-05-04 21:20:54

Akademy 2019

In September 2019 the Italian city of Milan hosted the main worldwide meeting of KDE contributors – Akademy, where members from different areas such as translators, developers, artists, promo people and more gather for a few days to think about and build the future of KDE's projects and community(ies). Before arriving… Continue reading »Akademy 2019
Spyder Blog 2020-04-22 17:00:00

Creating the ultimate terminal experience in Spyder 4 with Spyder-Terminal

This blogpost was originally published on the Quansight Labs website.

The Spyder-Terminal project is revitalized! The new 0.3.0 version adds numerous features that improve the user experience, and enhances compatibility with the latest Spyder 4 release, in part thanks to the improvements made in the xterm.js project.

Upgrade to ES6/JSX syntax

First, we were able to update all the old JavaScript files to use ES6/JSX syntax and the tests for the client terminal. This change simplified the code base and maintenance and allows us to easily extend the project to new functionalities that the xterm.js API offers. In order to compile this code and run it inside Spyder, we migrated our deployment to Webpack.

Multiple shells per operating system

In the new release, you now have the ability to configure which shell to use in the terminal. On Linux and UNIX systems, bash, sh, ksh, zsh, csh, pwsh, tcsh, screen, tmux, dash and rbash are supported, while cmd and powershell are the

(continued...)
Living in an Ivory Basement 2020-04-19 22:00:00

Software and workflow development practices (April 2020 update)

How we develop software and workflows in the DIB Lab, in 2020.

Martin Fitzpatrick - python 2020-04-13 11:01:00

Is it getting better yet? — An optimistic visual guide to the Coronavirus pandemic

As the apocalypse rumbles on, I found myself wondering "Is it getting any better?"

Daily updates of spiralling case numbers (and worse, deaths) do little to give a sense of whether we're getting to, or already past, the worst of it.

To answer that question for myself and you, I …