Planet SciPy

Quansight Labs 2023-09-19 00:00:00

Array API Support in scikit-learn

In this blog post, we share how scikit-learn enabled support for the Array API Standard.
scikit-learn Blog 2023-09-10 00:00:00

scikit-learn 2023 In-person Developer Sprint in Paris, France

Author: Reshama Shaikh, François Goupil 2023-09-07 08:15:37

Software Engineering Patterns for Machine Learning

Have you ever talked to your Front-end or Back-end engineer peers and noticed how much they care about code quality? Writing legible, reusable, and efficient code has always been a challenge in the software development community. Endless conversations happen every day across GitHub pull requests and Slack threads around this topic. How to best adapt… 2023-08-11 13:15:44

ML Pipeline Architecture Design Patterns (With 10 Real-World Examples)

There comes a time when every ML practitioner realizes that training a model in Jupyter Notebook is just one small part of the entire project. Getting a workflow ready which takes your data from its raw form to predictions while maintaining responsiveness and flexibility is the real deal. At that point, the Data Scientists or…
ListenData 2023-08-08 16:38:00

How to Run Windscribe VPN in Windows with Python

In this tutorial, we will show you how to run Windscribe VPN in Windows using Python Code. Windscribe is a popular VPN service that offers several features. Windscribe's free version maintains the same speed as the paid plans.

This post appeared first on ListenData
ListenData 2023-08-08 14:52:00

How to Run Proton VPN in Windows with Python

In this tutorial, we will show you how to run Proton VPN in Windows using Python Code.


First you need to download and install the OpenVPN GUI. OpenVPN GUI is a user-friendly application that allows you to easily configure and manage OpenVPN connections on your computer. OpenVPN is a popular open-source VPN protocol that provides secure and encrypted connections over public networks.

2023-08-04 14:10:10

Organizing ML Monorepo With Pants

Have you ever copy-pasted chunks of utility code between projects, resulting in multiple versions of the same code living in different repositories? Or, perhaps, you had to make pull requests to tens of projects after the name of the GCP bucket in which you store your data was updated? Situations described above arise way too… 2023-08-03 11:24:14

Learnings From Building the ML Platform at Stitch Fix

This article was originally an episode of the ML Platform Podcast, a show where Piotr Niedźwiedź and Aurimas Griciūnas, together with ML platform professionals, discuss design choices, best practices, example tool stacks, and real-world learnings from some of the best ML platform professionals. In this episode, Stefan Krawczyk shares his learnings from building the ML…
Filipe Saraiva's blog 2023-07-30 14:46:19

Master's in Computer Science 2023.2 at UFPA: NLP and Metaheuristics

We have another selection process open for the Master's in Computer Science at UFPA, with entry this August 2023. This time I am still looking for candidates who want to do research in the area of metaheuristics, for whatever combinatorial problems they wish to apply them to. This is still a very vast field and I have… 2023-07-18 11:20:16

Deploying Conversational AI Products to Production With Jason Flaks

This article was originally an episode of the MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to Jason Flaks about deploying conversational AI products to production. You can watch it on YouTube: Or…
ListenData 2023-07-04 18:10:00

How to Use ChatGPT for Data Science

In this article, we will explore how you, as a data scientist, can use ChatGPT to enhance your data science projects. ChatGPT is a powerful tool that can help you in various aspects of your work, from exploring and analyzing data to generating insights and helping you with coding and troubleshooting. It can also help you to learn data science faster.

Quansight Labs 2023-06-28 00:00:00

PyCon US 2023 - An action-packed week

In this post I'm sharing my experience of traveling to the US for PyCon US 2023 2023-06-27 14:22:37

How to Use SHAP Values to Optimize and Debug ML Models

Picture this, you’ve dedicated countless hours to training and fine-tuning your model, meticulously analyzing mountains of data. Yet, you lack a clear understanding of the factors influencing its predictions and, as a result, find it hard to improve it further.  If you have ever found yourself in such a situation, trying to make sense of… 2023-06-27 09:36:21

MLOps Landscape in 2023: Top Tools and Platforms

As you delve into the landscape of MLOps in 2023, you will find a plethora of tools and platforms that have gained traction and are shaping the way models are developed, deployed, and monitored. To provide you with a comprehensive overview, this article explores the key players in the MLOps and FMOps (or LLMOps) ecosystems,…
Quansight Labs 2023-06-27 00:00:00

Numba Dynamic Exceptions

In the following blogpost, we will explore the newly added feature in Numba: Dynamic exception support. We will discuss the previous limitations and explain how Numba was enhanced to handle runtime exceptions.
ListenData 2023-06-19 14:32:00

How to build ChatGPT Clone in Python

In this article, we will see the steps involved in building a chat application and an answering bot in Python using the ChatGPT API and gradio.

Developing a chat application in Python gives you more control and flexibility than the ChatGPT website. You can customize and extend the chat application as per your needs. It also helps you integrate with your existing systems and other APIs.

Keep the gradient flowing 2023-06-13 22:00:00

On the Convergence of the Unadjusted Langevin Algorithm

The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially private learning. In this post, I'll derive a simple convergence analysis of this method in the special case when the …
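As a rough illustration of the method named above, here is a minimal sketch of the unadjusted Langevin iteration, x_{k+1} = x_k - step * grad_f(x_k) + sqrt(2 * step) * noise, which approximately samples from a density proportional to exp(-f(x)). The target (a standard Gaussian, f(x) = x^2 / 2), the step size, and the function name `ula_samples` are assumptions chosen for the example, not taken from the post.

```python
import math
import random


def ula_samples(grad_f, x0, step, n_steps, rng):
    """Run the unadjusted Langevin iteration and return every iterate."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Gradient step on f plus Gaussian noise scaled by sqrt(2 * step).
        x = x - step * grad_f(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


# Assumed target: standard Gaussian, f(x) = x**2 / 2, so grad_f(x) = x.
rng = random.Random(0)
samples = ula_samples(lambda x: x, x0=3.0, step=0.1, n_steps=50_000, rng=rng)

kept = samples[1_000:]  # discard the first iterates as burn-in
mean = sum(kept) / len(kept)
variance = sum((s - mean) ** 2 for s in kept) / len(kept)
print(f"mean={mean:.3f} variance={variance:.3f}")
```

Because the iteration has no accept/reject correction, the empirical variance ends up slightly above 1 for any fixed step size; shrinking the step reduces this bias at the cost of slower mixing.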

Spyder Blog 2023-06-08 00:00:00

Spyder gets CZI grant to add remote development features, and a new job opening!

During the last few years, Spyder has positioned itself as a popular data science IDE by combining interactive computing and ease of use with robust programming tools. However, limited remote development support compared to some other IDEs has hindered adoption, as many users would like to work with data and code on high performance computing (HPC) clusters or cloud providers like AWS, GCP or DigitalOcean while developing on their personal computers. Adding such features would open up many new research possibilities by enabling the scientific community to tackle data and compute-intensive programming tasks from the ease and efficiency of their local development environments. Thanks to a two-year grant from the Chan Zuckerberg Initiative, we will now be able to address this shortcoming.

Right now, users have two main options to work remotely using a local IDE (aside from a purely web browser-based approach, which is sometimes not available or desirable): They can either edit and execute their files in a terminal, which is not

(continued...) 2023-06-06 12:40:58

How to Build ML Model Training Pipeline

Hands up if you’ve ever lost hours untangling messy scripts or felt like you’re hunting a ghost while trying to fix that elusive bug, all while your models are taking forever to train. We’ve all been there, right? But now, picture a different scenario: Clean code. Streamlined workflows. Efficient model training. Too good to be…
ListenData 2023-06-06 11:57:00

Transformers Agent: AI Tool That Automates Everything

We have a new AI tool in the market called Transformers Agent which is so powerful that it can automate just about any task you can think of. It can generate and edit images, video, audio, answer questions about documents, convert speech to text and do a lot of other things.

Hugging Face, a well-known name in the open-source AI world, released Transformers Agent, which provides a natural language API on top of transformers. The API is designed to be easy to use. With a single line of code, it provides a variety of tools for performing tasks such as question answering, image generation, video generation, text to speech, text classification, and summarization.

2023-06-05 13:53:41

What Does GPT-3 Mean For the Future of MLOps? With David Hershey

This article was originally an episode of the MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to David Hershey about GPT-3 and the future of MLOps. You can watch it on YouTube: Or… 2023-05-31 07:34:20

Building ML Platform in Retail and eCommerce

Getting machine learning to solve some of the hardest problems in an organization is great. And eCommerce companies have a ton of use cases where ML can help. The problem is, with more ML models and systems in production, you need to set up more infrastructure to reliably manage everything. And because of that, many…
ListenData 2023-05-26 09:38:00

Complete Guide to Massively Multilingual Speech (MMS) Model

In this article we have covered everything about the latest multilingual speech model from the basics of how it works to the step-by-step implementation of the model in Python.

Meta, the company that owns Facebook, released a new AI model called Massively Multilingual Speech (MMS) that can convert text to speech and speech to text in over 1,100 languages. It is available for free. It will not only help academicians and researchers across the world but also language preservationists or activists to document and preserve endangered languages to prevent their extinction.

MMS is trained on a large dataset of text and audio in over 1,100 languages. Another strength of the model is that it generates audio that sounds very natural, like human speech. It can also identify more than 4,000 spoken languages.

2023-05-17 13:20:24

How to Build ETL Data Pipeline in ML

From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers, needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their life much easier. This article explores the importance… 2023-05-10 13:56:49

How to Save Trained Model in Python

When working on real-world machine learning (ML) use cases, finding the best algorithm/model is not the end of your responsibilities. It is crucial to save, store, and package these models for their future use and deployment to production. These practices are needed for a number of reasons: To reiterate, while saving and storing ML models…
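As a small illustration of the save-and-restore step described above, the sketch below uses Python's standard `pickle` module. The "model" here is a hypothetical dictionary of learned parameters and `predict` is an illustrative helper, both assumptions made for the example; a real project would pickle (or otherwise serialize) its actual estimator object.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical "trained model": just a dict of learned parameters.
model = {"weights": [0.4, -1.2, 3.0], "bias": 0.5}


def predict(params, features):
    """Linear prediction using the stored weights and bias."""
    return sum(w * x for w, x in zip(params["weights"], features)) + params["bias"]


with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = Path(tmp_dir) / "model.pkl"

    # Save: serialize the model to disk in binary mode.
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)

    # Load: restore the model, e.g. in a separate serving process.
    with model_path.open("rb") as fh:
        restored = pickle.load(fh)

features = [1.0, 2.0, 3.0]
print(predict(model, features), predict(restored, features))
```

Note that `pickle.load` can execute arbitrary code, so it should only be used on files from a trusted source.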
Martin Fitzpatrick - python 2023-05-04 09:00:00

PyQt6 Book now available in Korean: 파이썬과 Qt6로 GUI 애플리케이션 만들기 — The hands-on guide to creating GUI applications with Python gets a new translation

I am very happy to announce that my Python GUI programming book Create GUI Applications with Python & Qt6 / PyQt6 Edition …

ListenData 2023-04-19 12:32:00

AutoGPT : Everything You Need To Know

In this post we have covered AutoGPT in detail. By the end of this tutorial, you will not only understand how it works but also be able to run it on your system. Auto-GPT has gained a significant amount of popularity in the media. It has become one of the most talked-about topics across various social media platforms after ChatGPT. It has captured the attention not only of people in the Artificial Intelligence community but also of people from other backgrounds. Media outlets across countries covered it and reported how it can automate everything ranging from simple to complex tasks.

What is AutoGPT?

AutoGPT is an experimental open-source project built on the latest ChatGPT model, i.e. GPT-4. It is not limited to ChatGPT, as it can also search the web and try to find information from the internet. When a client gives us a project with instructions on what to do, we, as analysts, perform tasks to fulfill the project requirements.

ListenData 2023-04-09 08:58:00

Open Source GPT-4 Models Made Easy

In this post we will explain how Open Source GPT-4 Models work and how you can use them as an alternative to a commercial OpenAI GPT-4 solution. Every day new open source large language models (LLMs) are emerging and the list gets bigger and bigger. We will cover two models: the GPT-4 version of Alpaca, and Vicuna. This tutorial covers how the models work, as well as their implementation with Python.

Introduction: Vicuna Model

Vicuna was the first publicly available open-source model whose output is comparable to GPT-4. It was fine-tuned from Meta's LLaMA 13B model on a dataset of conversations collected from ShareGPT. ShareGPT is a website where people share their ChatGPT conversations with others.

Important Note : The Vicuna Model was primarily trained on the GPT-3.5 dataset because most of the conversations on ShareGPT during the model's development were based on GPT-3.5. But the model was evaluated based on
Living in an Ivory Basement 2023-04-06 22:00:00

snakemake for doing bioinformatics - inputs and outputs and more!

Slithering your way into bioinformatics with snakemake - inputs and outputs and more!

ListenData 2023-03-30 08:01:00

14 Free and Open Source Alternatives to ChatGPT

In this article we will explain how Open Source ChatGPT alternatives work and how you can use them to build your own ChatGPT clone for free. We will introduce you to 14 powerful open source alternatives to ChatGPT, such as GPT4All, Dolly 2, Vicuna, Alpaca GPT-4. We have provided Python code for each of these models so you can run them with ease in Python. By the end of this article you will have a good understanding of these models and will be able to compare and use them according to your requirements.

ChatGPT is not open source. It has had two recent popular releases, GPT-3.5 and GPT-4. GPT-4 has major improvements over GPT-3.5 and is more accurate in producing responses. ChatGPT does not allow you to view or modify the source code as it is not publicly available. Hence there is a need for models that are open source and available for free. By using these open source

Martin Fitzpatrick - python 2023-03-20 06:00:00

Getting Started With Git and GitHub in Your Python Projects — Version-Controlling Your Python Projects With Git and GitHub

Using a version control system (VCS) is crucial for any software development project. These systems allow developers to track changes …

ListenData 2023-03-12 07:26:00

Complete Guide to Visual ChatGPT

In this post, we will talk about how to run Visual ChatGPT in Python with Google Colab. ChatGPT has garnered huge popularity recently due to its human-like responses. As of now, it only provides responses in text format, which means it cannot process, generate or edit images. Microsoft recently released a solution, Visual ChatGPT, to handle images. Now you can ask ChatGPT to generate or edit an image for you.

Demo of Visual ChatGPT

In the image below, you can see the final output of Visual ChatGPT and what it looks like.

Martin Fitzpatrick - python 2023-03-06 06:00:00

Working With Classes in Python — Understanding the Intricacies of Python Classes

Python supports object-oriented programming (OOP) through classes, which allow you to bundle data and behavior in a single entity. Python …

Living in an Ivory Basement 2023-03-02 23:00:00

snakemake for doing bioinformatics - using wildcards to generalize your rules

Slithering your way into bioinformatics with snakemake, wildcard version

Quansight Labs 2023-02-15 00:00:00

Quansight Labs Annual Report 2022: Celebrating Growth and Sustainability in Open Source

Presenting our first annual report! Read about our project achievements, community initiatives, and work culture.
Living in an Ivory Basement 2023-01-22 23:00:00

snakemake for doing bioinformatics - a beginner's guide (part 2)

Slithering your way into bioinformatics with snakemake, round 2.

Living in an Ivory Basement 2023-01-13 23:00:00

snakemake for doing bioinformatics - a beginner's guide (part 1)

Slithering your way into bioinformatics with snakemake

Quansight Labs 2023-01-10 00:00:00

Python packaging & workflows - where to next?

Potential solutions for pain points when dealing with native code; what needs unifying in the Python packaging space, and how should that be approached?
Living in an Ivory Basement 2023-01-07 23:00:00

sourmash has a plugin interface!

Enabling plugins in sourmash, for less directed & more incoherent progress!

Filipe Saraiva's blog 2022-12-15 01:13:41

Human obsolescence in the soap opera

I spent the day at work playing with ChatGPT, the conversational artificial intelligence. We had surreal and bizarre dialogues: I asked it what Latin America would be like had it been colonized by England, and also what the relationship is between The Lord of the Rings and Game of Thrones. In another, I asked it to write a fictional dialogue between…
Quansight Labs 2022-12-12 00:00:00

Sangho's Internship at Quansight with PyTorch-Ignite project

Blogpost of working on the PyTorch-Ignite project during internship at Quansight
ListenData 2022-12-09 08:31:00

ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5

ChatGPT has been trending on social media platforms. It crossed one million users in just a week. For those who haven't heard about ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chatbot that answers your questions, and the responses it provides may sound human-like. It's an impressive machine learning solution. With the release of GPT-4 we can rely on it over Google search for learning on any topic.

Update: I updated this article with reviews on GPT-4.
Why ChatGPT-3.5 Isn't Smart enough, but GPT-4 is

You can't trust ChatGPT-3.5 for preparation on any certification or exam. It's a big NO if you think you can rely on ChatGPT-3.5 to answer questions in a telephone interview round. Yes, I know it's cheating even to use Google for this, but I wanted to give a WARNING, as many people do this and many social media influencers have posted on how to leverage ChatGPT-3.5 for cracking

Quansight Labs 2022-12-05 00:00:00

Conda on Colaboratory

Surbhi Sharma shares her exciting experience working as an intern at Quansight Labs and contributing to condacolab, a tool that lets you deploy a Miniconda installation easily on Google Colab notebooks. This enables you to use conda or mamba to install new packages on any Colab session.
Spyder Blog 2022-11-30 00:00:00

Improvements to the Spyder IDE installation experience

Juan Sebastian Bautista, C.A.M. Gerlach and Carlos Cordoba also contributed to this post.

Spyder 5.4.0 was released recently, featuring some major enhancements to its Windows and macOS standalone installers. You'll now get more detailed feedback when new versions are available, and you can download and start the update to them from right within Spyder, instead of having to install them manually. In this post, we'll go over how these new update features work and how you can start using them!

Before proceeding, we want to acknowledge that this work was made possible by a Small Development Grant awarded to Spyder by NumFOCUS, which has enabled us to hire a new developer (Juan Sebastian Bautista Rojas) to be in charge of all the implementation details.

Before these improvements, Spyder already had a mechanism to detect more recent versions, but that functionality was very simple. There was a pop-up dialog warning that a new version was available, but users had to

scikit-learn Blog 2022-11-30 00:00:00

Interview with Meekail Zain, scikit-learn Team Member

Author: Reshama Shaikh, Meekail Zain
Quansight Labs 2022-11-28 00:00:00

Zoom zoom zoom! Improving Accessibility in JupyterLab

Kulsoom Zahra learns about accessibility and fixes a part of the JupyterLab interface (that used to break when zoomed in) during her summer 2022 internship at Quansight Labs.
Spyder Blog 2022-11-18 12:00:00

Introducing the Spyder-Watchlist plugin

Spyder's Variable Explorer is a great tool which aids the development and debugging of Python code by displaying all variables from the current scope. One thing the Variable Explorer is missing is the ability to display the value of arbitrary, user-definable expressions while debugging. For example, it might be useful to see the value of a specific attribute of an object, or the value of an array at some index. Such a feature is known as a "watchlist" or "watches" in other Integrated Development Environments (IDEs). This blog post introduces the Watchlist plugin developed for Spyder.


The watchlist consists of a user-definable list of expressions. They are evaluated after each debugger step, and the result of the evaluation is displayed as a string. This means that value = str(eval(expression)) is performed behind the scenes, and the result is shown in the plugin. The watchlist is a very powerful tool, but this comes at a cost: Any side effect of an expression will affect the execution environment.
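The evaluation step described above, including the side-effect caveat, can be sketched as follows. This is only an illustration of `str(eval(expression))` applied to a list of expressions, not the plugin's actual code: the function name `evaluate_watchlist`, the sample scope, and the choice to catch exceptions and display them as text are all assumptions.

```python
def evaluate_watchlist(expressions, namespace):
    """Evaluate each watch expression and return its string representation."""
    results = {}
    for expression in expressions:
        try:
            # The core idea from the post: value = str(eval(expression)).
            results[expression] = str(eval(expression, {}, namespace))
        except Exception as exc:
            # Assumption: show the error as text rather than abort the step.
            results[expression] = f"<{type(exc).__name__}: {exc}>"
    return results


# Hypothetical variables visible at the current debugger step.
scope = {"items": [1, 2, 3], "point": {"x": 10}}
watches = ["len(items)", "point['x']", "items.pop()", "missing_name"]

first = evaluate_watchlist(watches, scope)   # as if after one debugger step
second = evaluate_watchlist(watches, scope)  # as if after the next step

print(first)
print(second)
```

The `items.pop()` watch shows the cost mentioned above: each evaluation removes an element, so `len(items)` reports a smaller value on the second pass even though the program being debugged never touched the list.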

Expressions can be

Filipe Saraiva's blog 2022-11-15 02:42:48

Why did we abandon blogs?

(Image: Twitter's writing interface.) These days we are watching Elon Musk destroy Twitter. It is expected that, in this dynamic, the social network will lose users and relevance over time, if it does not blow up all at once, since its new owner is even talking about bankruptcy. This is not the first time that a…
Quansight Labs 2022-11-15 00:00:00

Making pygments accessible

accessible-pygments hosts curated WCAG-compliant themes for all your syntax highlighting needs.
Quansight Labs 2022-11-15 00:00:00

The new Spyder Editor documentation under the spotlights!

In this blogpost, I share my experience as a Google Season of Docs 2022 technical writer working on updating the Editor user documentation.
Quansight Labs 2022-11-14 00:00:00

Close Encounter with pandas and the Jedis of open source

Learning from awesome mentors and contributing to pandas open source
Quansight Labs 2022-11-10 00:00:00

Quansight Labs awarded three CZI EOSS Cycle 5 Grants

We are delighted to share details about new grants to support the sustainability of SciPy, conda-forge, and CuPy
scikit-learn Blog 2022-11-08 00:00:00

Pandas DataFrame Output for sklearn Transformers

Author: Sangam SwadiK
Quansight Labs 2022-11-07 00:00:00

Developing a Typer CLI for Nebari

The Nebari CLI consists of various commands the user needs to run to initialize, deploy, configure, and update Nebari.
Keep the gradient flowing 2022-10-14 22:00:00

The Russian Roulette: An Unbiased Estimator of the Limit

The idea for what was later called Monte Carlo method occurred to me when I was playing solitaire during my illness.

Stanislaw Ulam, Adventures of a Mathematician

The Russian Roulette offers a simple way to construct an unbiased estimator for the limit of a sequence. It allows for example to …
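To make the idea concrete, here is a minimal sketch of a Russian Roulette estimator. The limit of a sequence is written as the sum of its increments, the sum is truncated at a random index, and each kept increment is divided by the probability of having reached it, which keeps the expectation equal to the full sum. The geometric stopping rule, the example sequence, and the names `russian_roulette` and `increment` are assumptions chosen for illustration.

```python
import random


def russian_roulette(term, continue_prob, rng):
    """Return one unbiased sample of sum_{n >= 0} term(n)."""
    estimate = 0.0
    survival = 1.0  # probability that index n is reached
    n = 0
    while True:
        # Reweight each term by the probability of reaching it.
        estimate += term(n) / survival
        if rng.random() >= continue_prob:
            return estimate  # stop here (the "roulette" fires)
        survival *= continue_prob
        n += 1


def increment(n):
    """Increments of the sequence x_n = 1 - 0.5**(n + 1), whose limit is 1."""
    return 0.5 ** (n + 1)


rng = random.Random(0)
estimates = [russian_roulette(increment, 0.7, rng) for _ in range(50_000)]
mean_estimate = sum(estimates) / len(estimates)
print(f"average of estimates: {mean_estimate:.3f} (limit is 1)")
```

Each call evaluates only a handful of terms on average, yet the average over many calls approaches the limit. The continuation probability trades cost against variance and must decay more slowly than the squared increments for the variance to stay finite.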

scikit-learn Blog 2022-10-13 00:00:00

scikit-learn and Hugging Face join forces

Author: Lysandre Debut , François Goupil
scikit-learn Blog 2022-09-29 00:00:00

scikit-learn Sprint in Salta, Argentina

Author: Juan Martín Loyola
Martin Fitzpatrick - python 2022-09-21 09:00:00

Getting started with VS Code for Python — Setting up a Development Environment for Python programming

Setting up a working development environment is the first step for any project. Your development environment setup will determine how …

Keep the gradient flowing 2022-08-25 22:00:00

Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search

Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the step-size on gradient descent and quasi-Newton methods. However, these techniques are much less common for Frank-Wolfe-like algorithms. In this blog post I …
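For readers unfamiliar with the sufficient decrease condition mentioned above, the sketch below shows backtracking in its most common setting, plain gradient descent (it is not the Frank-Wolfe variant the post goes on to discuss). A trial step is shrunk until f(x - t g) <= f(x) - c t ||g||^2 holds. The test function, the constants, and the function name are assumptions made for the example.

```python
def backtracking_gradient_descent(f, grad, x0, step0=1.0, shrink=0.5, c=0.25, n_iters=500):
    """Gradient descent where each step size is found by backtracking."""
    x = list(x0)
    for _ in range(n_iters):
        g = grad(x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq < 1e-12:
            break  # gradient is essentially zero: converged
        fx = f(x)
        step = step0
        # Shrink the step until the sufficient decrease condition holds.
        while f([xi - step * gi for xi, gi in zip(x, g)]) > fx - c * step * g_norm_sq:
            step *= shrink
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


# Assumed test problem: an ill-conditioned quadratic with minimum at (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2


def grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


solution = backtracking_gradient_descent(f, grad, x0=[0.0, 0.0])
print(solution)
```

No Lipschitz constant is supplied anywhere: the inner loop discovers a workable step size from function evaluations alone, which is the appeal of this family of strategies.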

Quansight Labs 2022-08-07 00:00:00

Introducing the 2022 Interns Cohort

Quansight Labs is delighted to welcome its second cohort of 6 interns, who will work on a variety of open source projects and tasks
Spyder Blog 2022-07-25 12:00:00

New 2022 roadmap and grant funding

For the last couple of months, the Spyder team has been working on defining a new roadmap and submitting grant proposals to fund more features and improvements. We are pleased to announce our roadmap for the rest of 2022, and that two proposals were funded!

The roadmap

Considering the importance of sharing a clear perspective of where the Spyder project is going and where we will be focusing our efforts over the coming months, the team has created an initial roadmap for the rest of 2022. We prioritized the highlighted features and enhancements based on input from issues, face-to-face and virtual discussions, Stack Overflow, social media and other feedback, to try to best capture the interests of our users and community.

The proposals

To help make our roadmap achievable, we wrote and submitted proposals to several different venues and organizations in the last couple of months. While we have yet to hear back from some of them, two have already been funded!

The first was for the

Quansight Labs 2022-07-13 00:00:00

SciPy 2022 Accessibility Awareness Programs

Announcing the SciPy 2022 Accessibility Awareness Efforts
ListenData 2022-07-11 16:05:00

Pollution in India : Real-time AQI Data

Air pollution has become a serious problem in recent years across the world. The effects of air pollution are devastating, and they are not limited to humans but extend to animals and plants as well. It also leads to global warming, which is essentially the increase in air and ocean temperatures around the world.

Indian cities have been topping the list of polluted cities. To tackle air pollution, the most important first step is to track it in real time, which alerts people to avoid outdoor activities when air pollution is high. This post explains how you can fetch the real-time Air Quality Index (AQI) of Indian cities using Python and R code, so both Python and R programmers can pull pollution data.

You can download the dataset which contains static information about Indian states, cities and AQI stations. Variables stored in this dataset will be used further to fetch real-time data.

Gaël Varoquaux - programming 2022-07-09 22:00:00

My Mayavi story: discovering open source communities

The Mayavi Python software, and my personal history: A thread on Python and scipy ecosystems, building open source codebase, and meeting really cool and friendly people

I am writing today as a goodbye to the project: I used to be one of the core contributors and maintainers but have been …

ListenData 2022-06-30 14:04:00

Pointwise mutual information (PMI) in NLP

Natural Language Processing (NLP) has gained wide acceptance recently, as there are many live projects running and it is no longer limited to academia. Use cases of NLP can be seen across industries, such as understanding customers' issues, predicting the next word a user is about to type on the keyboard, automatic text summarization etc. Many researchers across the world have trained NLP models in several human languages like English, Spanish, French, Mandarin etc. so that the benefits of NLP can be seen in every society. In this post we will talk about one of the most useful NLP metrics, Pointwise mutual information (PMI), which identifies words that tend to go together, along with its implementation in Python and R.

What is Pointwise mutual information?

PMI helps us to find related words. In other words, it measures how much more likely the co-occurrence of two words is than we would expect by chance. For example, the phrase "Data Science" has a specific meaning when these
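As a small worked illustration of the metric, PMI for a word pair is log2( p(x, y) / (p(x) * p(y)) ), where p(x, y) is the probability of the pair occurring together and p(x), p(y) are the individual word probabilities. The sketch below estimates these from adjacent-word counts in a toy corpus; the corpus, the bigram-based notion of co-occurrence, and the function name `pmi_scores` are assumptions for the example, not the article's code.

```python
import math
from collections import Counter


def pmi_scores(sentences):
    """Compute PMI for every adjacent word pair in the given sentences."""
    word_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        word_counts.update(tokens)
        pair_counts.update(zip(tokens, tokens[1:]))  # adjacent pairs only

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (first, second), count in pair_counts.items():
        p_pair = count / total_pairs
        p_first = word_counts[first] / total_words
        p_second = word_counts[second] / total_words
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


corpus = [
    "data science is fun",
    "data science is hard",
    "science is fun",
    "the data is big",
]
scores = pmi_scores(corpus)
for pair in [("data", "science"), ("data", "is")]:
    print(pair, round(scores[pair], 2))
```

In this toy corpus "data science" scores higher than "data is" because "is" appears next to many different words, so seeing it beside "data" is closer to what chance alone would produce.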

Acoular 2022-06-24 05:00:00

How to import your data into Acoular

Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array which is stored in an HDF5 file. This blog post explains how to convert data available in other formats into this file format. As examples for other file formats we will use both .csv (comma separated text files) and .mat (Matlab files).
Quansight Labs 2022-06-06 00:00:00

Checking for accessibility: thoughts and a checklist!

A non-exhaustive but totally honest checklist for accessibility review
Keep the gradient flowing 2022-05-26 22:00:00

On the Link Between Optimization and Polynomials, Part 5

Six: All of this has happened before.
Baltar: But the question remains, does all of this have to happen again?
Six: This time I bet no.
Baltar: You know, I've never known you to play the optimist. Why the change of heart?
Six: Mathematics. Law of averages. Let a complex …

scikit-learn Blog 2022-05-22 00:00:00

Interview with Norbert Preining, scikit-learn Team Member

Author: Reshama Shaikh, Norbert Preining
Martin Fitzpatrick - python 2022-05-19 09:00:00

PyQt6, PySide6, PyQt5 and PySide2 Books -- updated for 2022! — New editions extended and updated, now 780+ pages

Hello! Today I have released new digital editions of my PyQt5, PyQt6, PySide2 and PySide6 book Create GUI Applications with …

ListenData 2022-05-06 11:06:00

Only size-1 arrays can be converted to Python scalars

NumPy is one of the most used modules in Python and it is used in a variety of tasks ranging from creating arrays to mathematical and statistical calculations. NumPy also brings efficiency to Python programming. While using NumPy you may encounter this error: TypeError: only size-1 arrays can be converted to Python scalars. It is one of the most frequently appearing errors and sometimes it becomes a daunting challenge to solve it.
Meaning: Only Size 1 Arrays Can Be Converted To Python Scalars Error. This error generally appears when Python expects a single value but you passed an array which consists of multiple values. For example, you want to calculate the exponential value of an array, but the function for the exponential value was designed for a scalar variable (which means a single value). When you pass a NumPy array to the function, it will raise this error. This error prevents your code from proceeding further and avoids unexpected output from the (continued...)
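The exponential example described above can be reproduced with a short sketch, assuming NumPy is installed. `math.exp` is a scalar function, so passing it a multi-element array raises the TypeError, while `np.exp` (or an explicit loop) works element by element. The exact wording of the error message may differ between NumPy versions, so the sketch only relies on the exception type.

```python
import math

import numpy as np

values = np.array([1.0, 2.0, 3.0])

# math.exp expects a single number, so a 3-element array triggers the error.
try:
    math.exp(values)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print("math.exp on an array failed with:", error_message)

# Fix 1: use the NumPy function, which is applied element by element.
elementwise = np.exp(values)

# Fix 2: apply the scalar function to each element yourself.
via_loop = [math.exp(v) for v in values]

print(elementwise)
print(via_loop)
```

The NumPy version is usually preferable since it stays vectorized; the loop is mainly useful when the scalar function has no NumPy equivalent.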
scikit-learn Blog 2022-05-04 00:00:00

Interview with Lucy Liu, scikit-learn Team Member

Author: Reshama Shaikh, Lucy Liu
Quansight Labs 2022-05-03 00:00:00

The evolution of the SciPy developer CLI

The development story of a developer command-line interface (CLI) for the SciPy project, with examples
Living in an Ivory Basement 2022-04-21 22:00:00

Storing 64-bit unsigned integers in SQLite databases, for fun and profit

Storing unsigned longs in SQLite is possible, and can be fast.
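One possible way to do this with Python's standard `sqlite3` module is sketched below. SQLite's INTEGER type is a signed 64-bit value, so the sketch reinterprets each unsigned value as its two's-complement signed counterpart before storing it and reverses the mapping on read. This is an assumption about how one might approach the problem, not necessarily the technique used in the post; the table name `hashes` and the helper names are hypothetical.

```python
import sqlite3

TWO_63 = 1 << 63
TWO_64 = 1 << 64


def to_signed(value):
    """Map an unsigned 64-bit integer onto SQLite's signed 64-bit range."""
    if not 0 <= value < TWO_64:
        raise ValueError("value does not fit in 64 unsigned bits")
    return value - TWO_64 if value >= TWO_63 else value


def to_unsigned(value):
    """Inverse of to_signed: recover the original unsigned integer."""
    return value + TWO_64 if value < 0 else value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (value INTEGER NOT NULL)")

# Inserting a value above 2**63 - 1 directly is rejected by the sqlite3 module.
largest = TWO_64 - 1
try:
    conn.execute("INSERT INTO hashes (value) VALUES (?)", (largest,))
    direct_insert_failed = False
except OverflowError:
    direct_insert_failed = True

originals = [0, 42, TWO_63 - 1, TWO_63, largest]
conn.executemany(
    "INSERT INTO hashes (value) VALUES (?)",
    [(to_signed(v),) for v in originals],
)
rows = conn.execute("SELECT value FROM hashes ORDER BY rowid")
restored = [to_unsigned(row[0]) for row in rows]
conn.close()

print(direct_insert_failed, restored == originals)
```

One trade-off of this mapping is that values at or above 2**63 are stored as negative numbers, so SQL comparisons and ORDER BY on the column no longer follow unsigned order; equality lookups are unaffected.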

Quansight Labs 2022-04-10 00:00:00

Why is writing blog posts hard?

In our weekly show and tell we got real about "why can writing blog posts be so hard?" and collaboratively wrote up this blog post about what we learned from the discussion.
Quansight Labs 2022-03-31 00:00:00

Making GPUs accessible to the PyData Ecosystem via the Array API Standard.

How we can use the Python Array API Standard with the fundamental libraries in the PyData ecosystem, along with CuPy, to make GPUs accessible to the users of these libraries.
Living in an Ivory Basement 2022-03-04 23:00:00

The First Common Fund Data Ecosystem Hackathon

We ran a successful pilot hackathon, and we will run a second one soon!

Quansight Labs 2022-02-28 00:00:00

Jupyter accessibility efforts have a roadmap!

The Chan Zuckerberg Initiative has funded efforts to make the Jupyter ecosystem, starting with JupyterLab, more accessible. As a part of these increased efforts, the team will be providing a periodically updated list of what is currently being worked on and what is coming soon.
Filipe Saraiva's blog 2022-02-06 14:31:39

Master's in Computer Science 2022: Metaheuristics

We still have some open places for the Master's in Computer Science at UFPA, Belém. Those interested should see the submission instructions on the program's selection page. Since joining the program I have been supervising students in different research projects on computational intelligence applied to smart grid problems. We have already had work on multi-agent systems…
Martin Fitzpatrick - python 2022-01-26 11:00:00

DiffCast: Hands-free Python Screencast Creator — Create reproducible programming screencasts without typos or edits

Programming screencasts are a popular way to teach programming and demo tools. Typically people will open up their favorite editor …

Quansight Labs 2022-01-19 00:00:00

Conda and Grayskull, the Masters of Software Packaging

Grayskull is an automatic conda recipe generator, with a focus on conda-forge.
Quansight Labs 2022-01-12 00:00:00

IPython 8.0, Lessons learned maintaining software

This is a companion post to the official release of IPython 8.0. We hope it will help you apply best practices and have an easier time maintaining your projects, or helping others.
Keep the gradient flowing 2022-01-09 23:00:00

Optimization Nuggets: Implicit Bias of Gradient-based Methods

When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post we'll characterize the implicit bias of gradient-based methods on a class of regression problems that includes linear least squares and Huber …
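
A small illustration of the phenomenon, not taken from the post: for an underdetermined linear least squares problem, plain gradient descent started at zero is known to converge to the minimum-norm solution among the infinitely many global minima. The problem sizes, seed and step size below are assumptions chosen for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# 5 equations and 20 unknowns: infinitely many x satisfy A @ x = b.
A = rng.normal(size=(5, 20))
b = rng.normal(size=5)

# Gradient descent on 0.5 * ||A x - b||^2, started at the origin.
x = np.zeros(20)
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value)^2
for _ in range(5000):
    x -= step * A.T @ (A @ x - b)

# The minimum-norm solution, computed with the pseudoinverse.
x_min_norm = np.linalg.pinv(A) @ b

gap = np.linalg.norm(x - x_min_norm)
residual = np.linalg.norm(A @ x - b)
print("distance to minimum-norm solution:", gap)
print("residual:", residual)
```

Starting from a different initial point would still reach zero residual but generally a different solution, which is the sense in which the algorithm and its initialization select one minimizer over the others.
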

Keep the gradient flowing 2021-12-14 23:00:00

Optimization Nuggets: Exponential Convergence of SGD

This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). For this first post in the series I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.
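
A rough numerical sketch of this behaviour, under assumptions not stated in the excerpt: a noisy linear least squares problem, uniform sampling of one row per step, and a constant step size. The distance to the least squares solution drops quickly at first and then hovers at a small noise-dependent level instead of reaching zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy linear regression data (sizes and noise level are illustrative).
n, d = 200, 5
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=n)

# Reference solution of the least squares problem.
x_star = np.linalg.lstsq(A, b, rcond=None)[0]

# SGD with a constant step size, one uniformly sampled row per iteration.
x = np.zeros(d)
step = 0.01
errors = [np.linalg.norm(x - x_star)]
for _ in range(5000):
    i = rng.integers(n)
    grad = (A[i] @ x - b[i]) * A[i]  # gradient of 0.5 * (a_i . x - b_i)^2
    x -= step * grad
    errors.append(np.linalg.norm(x - x_star))

print("initial error:", errors[0])
print("final error:  ", errors[-1])
```

Plotting `errors` on a logarithmic axis shows a roughly linear initial decrease followed by a plateau; a smaller step size lowers the plateau at the cost of a slower initial phase.
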

Quansight Labs 2021-12-10 00:00:00

A year of Jupyter community calls

A lot of us showed up for the code, but hung around for the community. We'll continue this post talking about the monthly Jupyter community calls, and how they help all jovyans, Project Jupyter's pet name for their developers and users, stay connected.
Quansight Labs 2021-11-17 00:00:00

A vision for extensibility to GPU & distributed support for SciPy, scikit-learn, scikit-image and beyond

In this post, we aim to articulate that vision and suggest a path to making it concrete, focusing on three libraries at the core of the PyData ecosystem: SciPy, scikit-learn and scikit-image.
Quansight Labs 2021-11-03 00:00:00

NumPy Benchmarking

My work focused mainly on providing performance benchmarks for NumPy in realistic situations. The goal was to show that NumPy is also efficient in quasi-real-life situations.
Gaël Varoquaux - programming 2021-10-28 22:00:00

Hiring an engineer and post-doc to simplify data science on dirty data


Join us to work on reinventing data-science practices and tools to produce robust analysis with less data curation.

It is well known that data cleaning and preparation are a heavy burden to the data scientist.

Dirty data research

In the dirty data project, we have been conducting machine-learning research …

Quansight Labs 2021-10-21 00:00:00

Dataframe interchange protocol: cuDF implementation

In the next lines, I'll try to capture my experience at Quansight Labs as an intern working on the cuDF implementation of the dataframe interchange protocol. cuDF is a dataframe library very much like pandas which operates on the GPU in order to benefit from its computing power.