Planet SciPy

Major Price Cuts: Deepnote Versus Cocalc --- Compute Server Pricing

Major Price Cuts: Deepnote Versus CoCalc

Deepnote is one of CoCalc's direct competitors. Today (November 30, 2023) they announced a major price cut on their pay-as-you-go rates:

"As you may have already heard, starting December 1, we're slashing the pay-as-you-go rates across all our machines – making them more budget-friendly without any hidden terms."

At CoCalc, we recently launched pay-as-you-go machines, which was one of our main development priorities for 2023. These are fully integrated with CoCalc and were a huge amount of work to bring to market. I was terrified that Deepnote's major price cuts would make Deepnote a much better deal than CoCalc.

Here is how the Deepnote and CoCalc pricing compares:

Machine                               Deepnote's New Price  CoCalc Standard  CoCalc Spot
64GB RAM, 16vCPU                      $1.54                 $0.59            $0.12
128GB RAM, 16vCPU (32 CPU on CoCalc)  $2.02                 $1.17            $0.23
K80 GPU (newer L4 GPU on CoCalc)      $2.02                 $0.93            $0.30

Conclusion: CoCalc's prices are still highly competitive, even in light of Deepnote's major price cuts.

Also, spot instances do work very well for many applications.
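
To make the comparison concrete, the short sketch below divides Deepnote's new price by each CoCalc price from the table above. The dictionary keys and the helper name deepnote_ratio are made up for illustration; the numbers are copied from the table, and the second and third rows compare hardware that is not identical. On these figures Deepnote comes out at roughly 1.7 to 2.6 times the CoCalc standard price and roughly 7 to 13 times the CoCalc spot price.

```python
# Prices copied from the comparison table (USD; the post does not state the billing unit).
PRICES = {
    "64GB RAM, 16vCPU": {"deepnote": 1.54, "cocalc_standard": 0.59, "cocalc_spot": 0.12},
    "128GB RAM, 16vCPU": {"deepnote": 2.02, "cocalc_standard": 1.17, "cocalc_spot": 0.23},
    "GPU (K80 vs L4)": {"deepnote": 2.02, "cocalc_standard": 0.93, "cocalc_spot": 0.30},
}


def deepnote_ratio(row, tier):
    """Return how many times Deepnote's price exceeds the given CoCalc tier."""
    return row["deepnote"] / row[tier]


for machine, row in PRICES.items():
    standard = deepnote_ratio(row, "cocalc_standard")
    spot = deepnote_ratio(row, "cocalc_spot")
    print(f"{machine}: {standard:.1f}x CoCalc standard, {spot:.1f}x CoCalc spot")
```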

ListenData 2023-11-28 14:46:00

How to Get Unique Values in a Column in Pandas DataFrame

This tutorial explains how to get unique values from a column in Pandas DataFrame, along with examples.

Find Unique Values in a Column
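
The article's own examples are not reproduced in this excerpt, so the following is only a minimal sketch of the usual approach with Series.unique() and Series.nunique(). The DataFrame and the column name "city" are made up for illustration.

```python
import pandas as pd

# Hypothetical data; the column name "city" is an assumption for this sketch.
df = pd.DataFrame({"city": ["Paris", "Delhi", "Paris", "Lima", "Delhi"]})

# unique() returns the distinct values in order of first appearance.
unique_cities = df["city"].unique()

# nunique() returns how many distinct values there are.
n_unique = df["city"].nunique()

print(unique_cities)
print(n_unique)
```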
To read this article in full, please click here
This post appeared first on ListenData
scikit-learn Blog 2023-11-27 00:00:00

My mentored internship at scikit-learn

Author: Stefanie Senger, François Goupil
Quansight Labs 2023-11-24 00:00:00

Unlocking C-level performance in pandas.DataFrame.apply with Numba

A quick overview of the new Numba engine in DataFrame.apply
Quansight Labs 2023-11-23 00:00:00

Improving the interpolation and signal processing capabilities of CuPy

We are excited to spread the news about the improvements that have been taking place in CuPy, where 18 interpolation and more than 100 signal processing parallel GPU APIs are now available as part of an EOSS4 CZI grant.
Keep the gradient flowing 2023-11-18 23:00:00

Optimization Nuggets: Stochastic Polyak Step-size, Part 2

This blog post discusses the convergence rate of the Stochastic Gradient Descent with Stochastic Polyak Step-size (SGD-SPS) algorithm for minimizing a finite sum objective. Building upon the proof of the previous post, we show that the convergence rate can be improved to O(1/t) under the additional assumption that …
2023-11-14 15:30:20

How to Visualize Deep Learning Models

Deep learning models are typically highly complex. While many traditional machine learning models make do with just a couple of hundred parameters, deep learning models have millions or billions of parameters. The large language model GPT-4 that OpenAI released in the spring of 2023 is rumored to have nearly 2 trillion parameters. It goes…
ListenData 2023-11-11 22:13:00

NumPy argmin() Function : Learn with Examples

In this tutorial, we will see how to use the NumPy argmin() function in Python along with examples.
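
As a quick illustration (the array below is made up, not taken from the tutorial), np.argmin returns an index into the flattened array by default, and per-column or per-row indices when axis is given.

```python
import numpy as np

values = np.array([[4, 2, 9],
                   [7, 1, 3]])

# Without axis, the array is flattened first: [4, 2, 9, 7, 1, 3].
flat_index = np.argmin(values)

# axis=0 gives the row index of the minimum in each column.
per_column = np.argmin(values, axis=0)

# axis=1 gives the column index of the minimum in each row.
per_row = np.argmin(values, axis=1)

print(flat_index, per_column, per_row)
```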

Quansight Labs 2023-11-08 00:00:00

The 'eu' in eucatastrophe – Why SciPy builds for Python 3.12 on Windows are a minor miracle

Moving SciPy to Meson meant finding a different Fortran compiler on Windows, which was particularly tricky to pull off for conda-forge. This blog tells the story about how things looked pretty grim for the Python 3.12 release, and how things ended up working out just in the nick of time.
Quansight Labs 2023-11-08 00:00:00

Adding support for polynomials to Numba

My work focused on improving NumPy support in Numba, with an emphasis on the polynomial package.
Quansight Labs 2023-11-08 00:00:00

Refining NumPy's Python API for its 2.0 release

A journey through NumPy's Python API from a maintenance perspective.
ListenData 2023-11-06 09:54:00

NumPy argmax() Function : Learn with Examples

In this tutorial, we will see how to use the NumPy argmax() function in Python along with examples.

The numpy.argmax() function in Python is used to find the indices of the maximum element in an array.

Syntax of NumPy argmax() Function

Below is the syntax of the NumPy argmax() function:

import numpy as np
np.argmax(array, axis, out)
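
A small usage sketch of the call above, on a made-up array: without axis the result indexes the flattened array, and np.unravel_index converts it back to a (row, column) position.

```python
import numpy as np

scores = np.array([[10, 50, 30],
                   [60, 20, 40]])

# Index of the maximum in the flattened array [10, 50, 30, 60, 20, 40].
flat_index = np.argmax(scores)

# Convert the flat index back to a (row, column) position.
position = np.unravel_index(flat_index, scores.shape)

# Row index of the maximum in each column, and column index of the maximum in each row.
per_column = np.argmax(scores, axis=0)
per_row = np.argmax(scores, axis=1)

print(flat_index, position, per_column, per_row)
```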
Quansight Labs 2023-10-31 00:00:00

Improving SymPy's Documentation

SymPy's documentation has received many significant improvements over the past two years thanks to funding by the Chan Zuckerberg Initiative.
Quansight Labs 2023-10-30 00:00:00

Doctesting for PyData Libraries

The journey of a PyData Newbie
Quansight Labs 2023-10-30 00:00:00

Integrating Hypothesis into SymPy

An introduction to the utility of Hypothesis in SymPy.
2023-10-20 11:42:04

How to Use Exploratory Notebooks [Best Practices]

Jupyter notebooks have been one of the most controversial tools in the data science community. There are some outspoken critics, as well as passionate fans. Nevertheless, many data scientists will agree that they can be really valuable – if used well. And that’s what we’re going to focus on in this article, which is the…
ListenData 2023-10-12 13:59:00

How to Install PyTorch on Windows

This tutorial explains the steps to install PyTorch on Windows.

PyTorch is a free and open source machine learning library developed by Facebook's AI Research lab. It is built on the Torch library and is mainly used for tasks like computer vision and natural language processing (NLP).

Quansight Labs 2023-10-04 00:00:00

The Array API Standard in SciPy

How can SciPy use the Array API Standard to achieve array library interoperability?
2023-10-03 08:58:29

Learnings From Building the ML Platform at Mailchimp

This article was originally an episode of the ML Platform Podcast, a show where Piotr Niedźwiedź and Aurimas Griciūnas, together with ML platform professionals, discuss design choices, best practices, example tool stacks, and real-world learnings from some of the best ML platform professionals. In this episode, Mikiko Bazeley shares her learnings from building the ML…
Keep the gradient flowing 2023-09-28 22:00:00

Optimization Nuggets: Stochastic Polyak Step-size

The stochastic Polyak step-size (SPS) is a practical variant of the Polyak step-size for stochastic optimization. In this blog post, we'll discuss the algorithm and provide a simple analysis for convex objectives with bounded gradients.
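
The post's own algorithm statement is not included in this excerpt. As a rough sketch, one commonly used capped form of the step-size is min((f_i(x) - f_i*) / (c * ||grad f_i(x)||^2), gamma_max), where f_i* is the minimum of the sampled loss. The example below applies it to a consistent least-squares problem, where f_i* = 0 is known. The function name sgd_sps, the constants c and gamma_max, and the test problem are all assumptions made for illustration, and the exact variant analyzed in the post may differ.

```python
import numpy as np


def sgd_sps(A, b, n_steps=2000, c=0.5, gamma_max=10.0, seed=0):
    """SGD with a capped stochastic Polyak step-size.

    Each sampled loss is f_i(x) = 0.5 * (a_i @ x - b_i)**2, and we assume
    its minimum value f_i* is 0 (the linear system A x = b is consistent).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = A.shape
    x = np.zeros(n_features)
    for _ in range(n_steps):
        i = rng.integers(n_samples)
        residual = A[i] @ x - b[i]
        loss_i = 0.5 * residual**2          # f_i(x) - f_i*, with f_i* = 0
        grad_i = residual * A[i]
        grad_sq_norm = grad_i @ grad_i
        if grad_sq_norm == 0.0:
            continue                        # sample already fitted exactly
        step = min(loss_i / (c * grad_sq_norm), gamma_max)
        x = x - step * grad_i
    return x


# Synthetic consistent system, so that every f_i can reach zero.
rng = np.random.default_rng(42)
A = rng.standard_normal((50, 5))
x_true = rng.standard_normal(5)
b = A @ x_true

x_hat = sgd_sps(A, b)
error = np.linalg.norm(x_hat - x_true)
print(error)
```

Note that the step-size needs no tuning of a learning rate here: it is computed from the sampled loss and gradient alone, which is what makes the method practical when f_i* is known or easy to bound.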

Quansight Labs 2023-09-20 00:00:00

Bridging Data Science Tools with PyTorch-Ignite's Code-Generator and Nebari

A summary of my contributions to the Code-Generator project and PyTorch-Ignite ecosystem in the past few months as a Quansight Labs intern, and my learnings in the process.
Quansight Labs 2023-09-19 00:00:00

Array API Support in scikit-learn

In this blog post, we share how scikit-learn enabled support for the Array API Standard.
scikit-learn Blog 2023-09-10 00:00:00

scikit-learn 2023 In-person Developer Sprint in Paris, France

Author: Reshama Shaikh, François Goupil
2023-09-07 08:15:37

Software Engineering Patterns for Machine Learning

Have you ever talked to your Front-end or Back-end engineer peers and noticed how much they care about code quality? Writing legible, reusable, and efficient code has always been a challenge in the software development community. Endless conversations happen every day across GitHub pull requests and Slack threads around this topic. How to best adapt…
2023-08-11 13:15:44

ML Pipeline Architecture Design Patterns (With 10 Real-World Examples)

There comes a time when every ML practitioner realizes that training a model in Jupyter Notebook is just one small part of the entire project. Getting a workflow ready which takes your data from its raw form to predictions while maintaining responsiveness and flexibility is the real deal. At that point, the Data Scientists or…
ListenData 2023-08-08 16:38:00

How to Run Windscribe VPN in Windows with Python

In this tutorial, we will show you how to run Windscribe VPN in Windows using Python Code. Windscribe is a popular VPN service that offers several features. Windscribe's free version maintains the same speed as the paid plans.

ListenData 2023-08-08 14:52:00

How to Run Proton VPN in Windows with Python

In this tutorial, we will show you how to run Proton VPN in Windows using Python Code.


First you need to download and install the OpenVPN GUI. OpenVPN GUI is a user-friendly application that allows you to easily configure and manage OpenVPN connections on your computer. OpenVPN is a popular open-source VPN protocol that provides secure and encrypted connections over public networks.

2023-08-04 14:10:10

Organizing ML Monorepo With Pants

Have you ever copy-pasted chunks of utility code between projects, resulting in multiple versions of the same code living in different repositories? Or, perhaps, you had to make pull requests to tens of projects after the name of the GCP bucket in which you store your data was updated? Situations described above arise way too…
2023-08-03 11:24:14

Learnings From Building the ML Platform at Stitch Fix

This article was originally an episode of the ML Platform Podcast, a show where Piotr Niedźwiedź and Aurimas Griciūnas, together with ML platform professionals, discuss design choices, best practices, example tool stacks, and real-world learnings from some of the best ML platform professionals. In this episode, Stefan Krawczyk shares his learnings from building the ML…
Filipe Saraiva's blog 2023-07-30 14:46:19

Master's in Computer Science 2023.2 at UFPA: NLP and Metaheuristics

We have another selection process open for the Master's in Computer Science at UFPA, with admission this August 2023. This time I am still looking for candidates who want to do research in metaheuristics, applied to any combinatorial problems they choose. This is still a very broad field and I have…
2023-07-18 11:20:16

Deploying Conversational AI Products to Production With Jason Flaks

This article was originally an episode of MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to Jason Flaks about deploying conversational AI products to production. You can watch it on YouTube: Or…
ListenData 2023-07-04 18:10:00

How to Use ChatGPT for Data Science

In this article, we will explore how you, as a data scientist, can use ChatGPT to enhance your data science projects. ChatGPT is a powerful tool that can help you in various aspects of your work, from exploring and analyzing data to generating insights and helping you with coding and troubleshooting. It can also help you to learn data science faster.

Quansight Labs 2023-06-28 00:00:00

PyCon US 2023 - An action-packed week

In this post I'm sharing my experience of traveling to the US for PyCon US 2023.
2023-06-27 14:22:37

How to Use SHAP Values to Optimize and Debug ML Models

Picture this: you’ve dedicated countless hours to training and fine-tuning your model, meticulously analyzing mountains of data. Yet, you lack a clear understanding of the factors influencing its predictions and, as a result, find it hard to improve it further. If you have ever found yourself in such a situation, trying to make sense of…
2023-06-27 09:36:21

MLOps Landscape in 2023: Top Tools and Platforms

As you delve into the landscape of MLOps in 2023, you will find a plethora of tools and platforms that have gained traction and are shaping the way models are developed, deployed, and monitored. To provide you with a comprehensive overview, this article explores the key players in the MLOps and FMOps (or LLMOps) ecosystems,…
Quansight Labs 2023-06-27 00:00:00

Numba Dynamic Exceptions

In the following blogpost, we will explore the newly added feature in Numba: Dynamic exception support. We will discuss the previous limitations and explain how Numba was enhanced to handle runtime exceptions.
ListenData 2023-06-19 14:32:00

How to build ChatGPT Clone in Python

In this article, we will see the steps involved in building a chat application and an answering bot in Python using the ChatGPT API and gradio.

Developing a chat application in Python provides more control and flexibility than the ChatGPT website. You can customize and extend the chat application as per your needs. It also helps you integrate with your existing systems and other APIs.

Keep the gradient flowing 2023-06-13 22:00:00

On the Convergence of the Unadjusted Langevin Algorithm

The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially private learning. In this post, I'll derive a simple convergence analysis of this method in the special case when the …
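
To give a feel for the method, here is a minimal sketch of the unadjusted Langevin iteration x_{k+1} = x_k - step * grad U(x_k) + sqrt(2 * step) * noise, which targets a density proportional to exp(-U). The target (a standard Gaussian, U(x) = x^2 / 2), the step size, and the function name are assumptions chosen for illustration and are not taken from the post. Many independent chains are run in parallel as one vector.

```python
import numpy as np


def unadjusted_langevin(grad_potential, x0, step_size, n_steps, rng):
    """Run the unadjusted Langevin iteration and return the final state."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step_size * grad_potential(x) + np.sqrt(2.0 * step_size) * noise
    return x


rng = np.random.default_rng(0)

# Standard Gaussian target: U(x) = x**2 / 2, so grad U(x) = x.
# Each of the 10,000 entries is an independent chain started at 0.
samples = unadjusted_langevin(lambda x: x, np.zeros(10_000), 0.1, 200, rng)

print(samples.mean(), samples.var())
```

Because there is no Metropolis correction, the chain samples from a slightly biased distribution: for this Gaussian target the stationary variance is 1 / (1 - step_size / 2), a little above 1, and the bias shrinks as the step size decreases.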

Spyder Blog 2023-06-08 00:00:00

Spyder gets CZI grant to add remote development features, and a new job opening!

During the last few years, Spyder has positioned itself as a popular data science IDE by combining interactive computing and ease of use with robust programming tools. However, limited remote development support compared to some other IDEs has hindered adoption, as many users would like to work with data and code on high performance computing (HPC) clusters or cloud providers like AWS, GCP or DigitalOcean while developing on their personal computers. Adding such features would open up many new research possibilities by enabling the scientific community to tackle data and compute-intensive programming tasks from the ease and efficiency of their local development environments. Thanks to a two-year grant from the Chan Zuckerberg Initiative, we will now be able to address this shortcoming.

Right now, users have two main options to work remotely using a local IDE (aside from a purely web browser-based approach, which is sometimes not available or desirable): They can either edit and execute their files in a terminal, which is not

2023-06-06 12:40:58

How to Build ML Model Training Pipeline

Hands up if you’ve ever lost hours untangling messy scripts or felt like you’re hunting a ghost while trying to fix that elusive bug, all while your models are taking forever to train. We’ve all been there, right? But now, picture a different scenario: Clean code. Streamlined workflows. Efficient model training. Too good to be…
ListenData 2023-06-06 11:57:00

Transformers Agent: AI Tool That Automates Everything

We have a new AI tool in the market called Transformers Agent which is so powerful that it can automate just about any task you can think of. It can generate and edit images, video, audio, answer questions about documents, convert speech to text and do a lot of other things.

Hugging Face, a well-known name in the open-source AI world, released Transformers Agent, which provides a natural language API on top of transformers. The API is designed to be easy to use. With a single line of code, it provides a variety of tools for performing natural language tasks, such as question answering, image generation, video generation, text to speech, text classification, and summarization.

2023-06-05 13:53:41

What Does GPT-3 Mean For the Future of MLOps? With David Hershey

This article was originally an episode of MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to David Hershey about GPT-3 and the future of MLOps. You can watch it on YouTube: Or…
ListenData 2023-05-26 09:38:00

Complete Guide to Massively Multilingual Speech (MMS) Model

In this article we have covered everything about the latest multilingual speech model from the basics of how it works to the step-by-step implementation of the model in Python.

Meta, the company that owns Facebook, released a new AI model called Massively Multilingual Speech (MMS) that can convert text to speech and speech to text in over 1,100 languages. It is available for free. It will not only help academicians and researchers across the world but also language preservationists or activists to document and preserve endangered languages to prevent their extinction.

MMS is trained on a large dataset of text and audio in over 1,100 languages. Another strength of the model is that it generates audio that sounds very natural, like human speech. It is also able to identify more than 4,000 spoken languages.

Martin Fitzpatrick - python 2023-05-04 09:00:00

PyQt6 Book now available in Korean: 파이썬과 Qt6로 GUI 애플리케이션 만들기 — The hands-on guide to creating GUI applications with Python gets a new translation

I am very happy to announce that my Python GUI programming book Create GUI Applications with Python & Qt6 / PyQt6 Edition …

ListenData 2023-04-19 12:32:00

AutoGPT : Everything You Need To Know

In this post we cover AutoGPT in detail. By the end of this tutorial, you will not only understand how it works but also be able to run it on your system. Auto-GPT has gained a significant amount of popularity in the media. It has become one of the most talked-about topics across social media platforms after ChatGPT. It has captured the attention not only of people in the Artificial Intelligence community but also of people from other backgrounds. Media outlets across countries covered it and reported how it can automate everything from simple to complex tasks.

What is AutoGPT?

AutoGPT is an experimental open-source project built on the latest ChatGPT model, i.e., GPT-4. It is not limited to ChatGPT, as it can also search the web and try to find information on the internet. When a client gives us a project with instructions on what to do, we, as analysts, perform tasks to fulfill the project requirements.

ListenData 2023-04-09 08:58:00

Open Source GPT-4 Models Made Easy

In this post we will explain how Open Source GPT-4 Models work and how you can use them as an alternative to a commercial OpenAI GPT-4 solution. Every day new open source large language models (LLMs) emerge, and the list keeps growing. We will cover two models: the GPT-4 version of Alpaca, and Vicuna. This tutorial covers how the models work, as well as their implementation in Python.

Introduction: Vicuna Model

Vicuna was the first publicly available open-source model with output comparable to GPT-4. It was fine-tuned from Meta's LLaMA 13B model on a conversation dataset collected from ShareGPT. ShareGPT is a website where people share their ChatGPT conversations with others.

Important Note : The Vicuna Model was primarily trained on the GPT-3.5 dataset because most of the conversations on ShareGPT during the model's development were based on GPT-3.5. But the model was evaluated based on
Living in an Ivory Basement 2023-04-06 22:00:00

snakemake for doing bioinformatics - inputs and outputs and more!

Slithering your way into bioinformatics with snakemake - inputs and outputs and more!

ListenData 2023-03-30 08:01:00

15 Free Open Source ChatGPT Alternatives (with Code)

In this article we will explain how Open Source ChatGPT alternatives work and how you can use them to build your own ChatGPT clone for free. By the end of this article you will have a good understanding of these models and will be able to compare and use them.

Benefits of Open Source ChatGPT Alternatives

There are various benefits of using open source large language models which are alternatives to ChatGPT. Some of them are listed below.

  1. Data Privacy: Many companies want to have control over their data. This matters to them because they don't want any third party to have access to it.
  2. Customization: It allows developers to train large language models on their own data and to apply filtering on some topics if they want.
  3. Affordability: Open source GPT models let you train sophisticated large language models without worrying about expensive hardware.
  4. Democratizing AI: It opens room for further research which can be used for solving real-world problems.
Martin Fitzpatrick - python 2023-03-20 06:00:00

Getting Started With Git and GitHub in Your Python Projects — Version-Controlling Your Python Projects With Git and GitHub

Using a version control system (VCS) is crucial for any software development project. These systems allow developers to track changes …

ListenData 2023-03-12 07:26:00

Complete Guide to Visual ChatGPT

In this post, we will talk about how to run Visual ChatGPT in Python with Google Colab. ChatGPT has garnered huge popularity recently due to its human-style responses. As of now, it only provides responses in text format, which means it cannot process, generate or edit images. Microsoft recently released a solution that handles images. Now you can ask ChatGPT to generate or edit an image for you.

Demo of Visual ChatGPT

In the image below, you can see the final output of Visual ChatGPT and what it looks like.

Martin Fitzpatrick - python 2023-03-06 06:00:00

Working With Classes in Python — Understanding the Intricacies of Python Classes

Python supports object-oriented programming (OOP) through classes, which allow you to bundle data and behavior in a single entity. Python …

Living in an Ivory Basement 2023-03-02 23:00:00

snakemake for doing bioinformatics - using wildcards to generalize your rules

Slithering your way into bioinformatics with snakemake, wildcard version

Quansight Labs 2023-02-15 00:00:00

Quansight Labs Annual Report 2022: Celebrating Growth and Sustainability in Open Source

Presenting our first annual report! Read about our project achievements, community initiatives, and work culture.
Living in an Ivory Basement 2023-01-22 23:00:00

snakemake for doing bioinformatics - a beginner's guide (part 2)

Slithering your way into bioinformatics with snakemake, round 2.

Living in an Ivory Basement 2023-01-13 23:00:00

snakemake for doing bioinformatics - a beginner's guide (part 1)

Slithering your way into bioinformatics with snakemake

Quansight Labs 2023-01-10 00:00:00

Python packaging & workflows - where to next?

Potential solutions for pain points when dealing with native code; what needs unifying in the Python packaging space, and how should that be approached?
Living in an Ivory Basement 2023-01-07 23:00:00

sourmash has a plugin interface!

Enabling plugins in sourmash, for less directed & more incoherent progress!

Filipe Saraiva's blog 2022-12-15 01:13:41

Human Obsolescence in the Soap Opera

I spent the day at work playing with ChatGPT, the conversational artificial intelligence. We had surreal and outlandish dialogues: I asked it what Latin America would be like had it been colonized by England, and also what the relationship is between The Lord of the Rings and Game of Thrones. In another, I asked it to write a fictional dialogue between…
Quansight Labs 2022-12-12 00:00:00

Sangho's Internship at Quansight with PyTorch-Ignite project

A blog post about working on the PyTorch-Ignite project during an internship at Quansight.
ListenData 2022-12-09 08:31:00

ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5

ChatGPT has been trending on social media platforms. It crossed one million users in just a week. For those who haven't heard of ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chatbot that answers your questions, and its responses may sound human-like. It's an impressive machine learning solution. With the release of GPT-4 we can rely on it over Google search for learning on any topic.

Update: I updated this article with reviews on GPT-4.
Why ChatGPT-3.5 Isn't Smart enough, but GPT-4 is

You can't trust ChatGPT-3.5 when preparing for any certification or exam. It's a big NO if you think you can rely on ChatGPT-3.5 to answer questions in a telephone interview round. Yes, I know it's cheating even to use Google for this, but I wanted to give a WARNING, as many people do this and many social media influencers have posted on how to leverage ChatGPT-3.5 for cracking

Quansight Labs 2022-12-05 00:00:00

Conda on Colaboratory

Surbhi Sharma shares her exciting experience working as an intern at Quansight Labs and contributing to condacolab, a tool that lets you deploy a Miniconda installation easily on Google Colab notebooks. This enables you to use conda or mamba to install new packages on any Colab session.
Spyder Blog 2022-11-30 00:00:00

Improvements to the Spyder IDE installation experience

Juan Sebastian Bautista, C.A.M. Gerlach and Carlos Cordoba also contributed to this post.

Spyder 5.4.0 was released recently, featuring some major enhancements to its Windows and macOS standalone installers. You'll now get more detailed feedback when new versions are available, and you can download and start the update to them from right within Spyder, instead of having to install them manually. In this post, we'll go over how these new update features work and how you can start using them!

Before proceeding, we want to acknowledge that this work was made possible by a Small Development Grant awarded to Spyder by NumFOCUS, which has enabled us to hire a new developer (Juan Sebastian Bautista Rojas) to be in charge of all the implementation details.

Before these improvements, Spyder already had a mechanism to detect more recent versions, but that functionality was very simple. There was a pop-up dialog warning that a new version was available, but users had to

scikit-learn Blog 2022-11-30 00:00:00

Interview with Meekail Zain, scikit-learn Team Member

Author: Reshama Shaikh, Meekail Zain
Quansight Labs 2022-11-28 00:00:00

Zoom zoom zoom! Improving Accessibility in JupyterLab

Kulsoom Zahra learns about accessibility and fixes a part of the JupyterLab interface (that used to break when zoomed in) during her summer 2022 internship at Quansight Labs.
Spyder Blog 2022-11-18 12:00:00

Introducing the Spyder-Watchlist plugin

Spyder's Variable Explorer is a great tool which aids the development and debugging of Python code by displaying all variables from the current scope. One thing the Variable Explorer is missing is the ability to display the value of arbitrary, user-definable expressions while debugging. For example, it might be useful to see the value of a specific attribute of an object, or the value of an array at some index. Such a feature is known as a "watchlist" or "watches" in other Integrated Development Environments (IDEs). This blog post introduces the Watchlist plugin developed for Spyder.


The watchlist consists of a user-definable list of expressions. They are evaluated after each debugger step, and the result of the evaluation is displayed as a string. This means that value = str(eval(expression)) is performed behind the scenes, and the result is shown in the plugin. The watchlist is a very powerful tool, but this comes at a cost: Any side effect of an expression will affect the execution environment.
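
The behavior described above can be illustrated with a short sketch. This is not the plugin's code: the function name evaluate_watchlist, the error handling, and the sample namespace are assumptions for illustration. It shows both the str(eval(expression)) idea and how an expression with a side effect (here, items.pop()) changes the state that later evaluations see.

```python
def evaluate_watchlist(expressions, namespace):
    """Evaluate each watch expression in `namespace` and return string results."""
    results = {}
    for expression in expressions:
        try:
            results[expression] = str(eval(expression, namespace))
        except Exception as exc:
            # Assumption for this sketch: failed expressions show the error text.
            results[expression] = f"<error: {exc!r}>"
    return results


# Hypothetical debugging scope with a single variable.
namespace = {"items": [3, 1, 2]}

# The last expression has a side effect: it removes an element from the list.
first_pass = evaluate_watchlist(["items[0]", "len(items)", "items.pop()"], namespace)

# After the side effect, the same expression reports a different value.
second_pass = evaluate_watchlist(["len(items)"], namespace)

print(first_pass)
print(second_pass)
```

In practice this means watch expressions are best kept read-only (attribute access, indexing, len() and similar), since anything that mutates state will alter the program being debugged.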

Expressions can be

Filipe Saraiva's blog 2022-11-15 02:42:48

Why did we abandon blogs?

(Image: Twitter's writing interface.) These days we are watching Elon Musk destroy Twitter. The expectation is that, in this dynamic, the social network will lose users and relevance over time, if it doesn't blow up all at once, since its new owner is even talking about bankruptcy. It is not the first time that a…
Quansight Labs 2022-11-15 00:00:00

Making pygments accessible

accessible-pygments hosts curated WCAG-compliant themes for all your syntax highlighting needs.
Quansight Labs 2022-11-15 00:00:00

The new Spyder Editor documentation under the spotlights!

In this blogpost, I share my experience as a Google Season of Docs 2022 technical writer working on updating the Editor user documentation.
Quansight Labs 2022-11-14 00:00:00

Close Encounter with pandas and the Jedis of open source

Learning from awesome mentors and contributing to pandas open source
Quansight Labs 2022-11-10 00:00:00

Quansight Labs awarded three CZI EOSS Cycle 5 Grants

We are delighted to share details about new grants to support the sustainability of SciPy, conda-forge, and CuPy
scikit-learn Blog 2022-11-08 00:00:00

Pandas DataFrame Output for sklearn Transformers

Author: Sangam SwadiK
Quansight Labs 2022-11-07 00:00:00

Developing a Typer CLI for Nebari

The Nebari CLI consists of various commands the user needs to run to initialize, deploy, configure, and update Nebari.
Keep the gradient flowing 2022-10-14 22:00:00

The Russian Roulette: An Unbiased Estimator of the Limit

The idea for what was later called Monte Carlo method occurred to me when I was playing solitaire during my illness.

Stanislaw Ulam, Adventures of a Mathematician

The Russian Roulette offers a simple way to construct an unbiased estimator for the limit of a sequence. It allows for example to …
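
The post's derivation is cut off in this excerpt, so the following is only a sketch of one standard form of the estimator. Write the limit as a sum of increments d_k = x_k - x_{k-1} (with x_{-1} = 0), stop at a random index N, and reweight each kept increment by 1 / P(N >= k). Here N is geometric with continuation probability continue_prob, so P(N >= k) = continue_prob**k. The sequence x_k = 1 - 2**-(k+1) (limit 1), the function names, and the constants are assumptions for illustration.

```python
import random


def russian_roulette_estimate(increment, continue_prob, rng):
    """One sample of sum_{k=0}^{N} increment(k) / P(N >= k) with geometric N."""
    estimate = 0.0
    survival = 1.0  # P(N >= k); equals continue_prob**k
    k = 0
    while True:
        estimate += increment(k) / survival
        if rng.random() >= continue_prob:
            return estimate  # the roulette stops the sum here
        survival *= continue_prob
        k += 1


def increment(k):
    """Increments of the sequence x_k = 1 - 2**-(k + 1), whose limit is 1."""
    return 0.5 ** (k + 1)


rng = random.Random(0)
samples = [russian_roulette_estimate(increment, 0.75, rng) for _ in range(20_000)]
mean_estimate = sum(samples) / len(samples)
print(mean_estimate)
```

Each sample uses only finitely many terms of the sequence, yet the average converges to the limit rather than to a truncated value. The continuation probability must decay more slowly than the increments do, otherwise the variance of the estimator can blow up.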

scikit-learn Blog 2022-10-13 00:00:00

scikit-learn and Hugging Face join forces

Author: Lysandre Debut, François Goupil
scikit-learn Blog 2022-09-29 00:00:00

scikit-learn Sprint in Salta, Argentina

Author: Juan Martín Loyola
Martin Fitzpatrick - python 2022-09-21 09:00:00

Getting started with VS Code for Python — Setting up a Development Environment for Python programming

Setting up a working development environment is the first step for any project. Your development environment setup will determine how …

Keep the gradient flowing 2022-08-25 22:00:00

Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search

Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the step-size on gradient descent and quasi-Newton methods. However, these techniques are much less common for Frank-Wolfe-like algorithms. In this blog post I …

Quansight Labs 2022-08-07 00:00:00

Introducing the 2022 Interns Cohort

Quansight Labs is delighted to welcome its second cohort of 6 interns, who will work on a variety of open source projects and tasks
Spyder Blog 2022-07-25 12:00:00

New 2022 roadmap and grant funding

For the last couple of months, the Spyder team has been working on defining a new roadmap and submitting grant proposals to fund more features and improvements. We are pleased to announce our roadmap for the rest of 2022, and that two proposals were funded!

The roadmap

Considering the importance of sharing a clear perspective of where the Spyder project is going and where we will be focusing our efforts over the coming months, the team has created an initial roadmap for the rest of 2022. We prioritized the highlighted features and enhancements based on input from issues, face-to-face and virtual discussions, Stack Overflow, social media and other feedback, to try to best capture the interests of our users and community.

The proposals

To help make our roadmap achievable, we wrote and submitted proposals to several different venues and organizations in the last couple of months. While we have yet to hear back from some of them, two have already been funded!

The first was for the

Quansight Labs 2022-07-13 00:00:00

SciPy 2022 Accessibility Awareness Programs

Announcing the SciPy 2022 Accessibility Awareness Efforts
ListenData 2022-07-11 16:05:00

Pollution in India : Real-time AQI Data

Air pollution has become a serious problem in recent years across the world. The effects of air pollution are devastating, and they are not limited to humans but extend to animals and plants as well. It also leads to global warming, which is essentially the rise in air and ocean temperatures around the world.

Indian cities have been topping the list of polluted cities. To solve the problem of air pollution, the most important first step is to track it in real time, which alerts people to avoid outdoor activities when air pollution is high. This post explains how you can fetch the real-time Air Quality Index (AQI) of Indian cities using Python and R code, so both Python and R programmers can pull pollution data.

You can download the dataset which contains static information about Indian states, cities and AQI stations. Variables stored in this dataset will be used further to fetch real-time data.

Gaël Varoquaux - programming 2022-07-09 22:00:00

My Mayavi story: discovering open source communities

The Mayavi Python software, and my personal history: A thread on the Python and SciPy ecosystems, building an open source codebase, and meeting really cool and friendly people

I am writing today as a goodbye to the project: I used to be one of the core contributors and maintainers but have been …

ListenData 2022-06-30 14:04:00

Pointwise mutual information (PMI) in NLP

Natural Language Processing (NLP) has gained wide acceptance recently: there are many live projects running, and it is no longer limited to academia. Use cases of NLP can be seen across industries, such as understanding customers' issues, predicting the next word a user is planning to type on the keyboard, automatic text summarization, etc. Many researchers across the world have trained NLP models in several human languages, such as English, Spanish, French and Mandarin, so that the benefits of NLP can be seen in every society. In this post we will talk about one of the most useful NLP metrics, Pointwise mutual information (PMI), which identifies words that go together, along with its implementation in Python and R.

Table of Contents

What is Pointwise mutual information?

PMI helps us to find related words. In other words, it measures how much more likely the co-occurrence of two words is than we would expect by chance. For example the word "Data Science" has a specific meaning when these
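A minimal sketch of the computation, assuming the usual definition PMI(x, y) = log2(p(x, y) / (p(x) p(y))) with probabilities estimated from raw word and adjacent-pair counts. The helper name and the toy sentence are made up for illustration and are not the post's implementation:

```python
import math
from collections import Counter


def pmi_for_bigrams(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in tokens."""
    word_counts = Counter(tokens)
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    n_words = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigram_counts.items():
        p_xy = count / n_bigrams          # observed co-occurrence probability
        p_x = word_counts[w1] / n_words   # probability of the first word
        p_y = word_counts[w2] / n_words   # probability of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


tokens = ("data science is fun and data science is useful "
          "but science fiction is fun").split()
scores = pmi_for_bigrams(tokens)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

A positive score means the pair occurs together more often than independent words would. Pairs of rare words can score highly on very little evidence, so real corpora usually call for a minimum count threshold.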

Acoular 2022-06-24 05:00:00

How to import your data into Acoular

Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. It expects this data to be stored in an HDF5 file. This blog post explains how to convert data available in other formats into this file format. As examples for other file formats we will use both .csv (comma separated text files) and .mat (Matlab files).
Quansight Labs 2022-06-06 00:00:00

Checking for accessibility: thoughts and a checklist!

A non-exhaustive but totally honest checklist for accessibility review
Keep the gradient flowing 2022-05-26 22:00:00

On the Link Between Optimization and Polynomials, Part 5

Six: All of this has happened before.
Baltar: But the question remains, does all of this have to happen again?
Six: This time I bet no.
Baltar: You know, I've never known you to play the optimist. Why the change of heart?
Six: Mathematics. Law of averages. Let a complex …

scikit-learn Blog 2022-05-22 00:00:00

Interview with Norbert Preining, scikit-learn Team Member

Author: Reshama Shaikh , Norbert Preining
Martin Fitzpatrick - python 2022-05-19 09:00:00

PyQt6, PySide6, PyQt5 and PySide2 Books -- updated for 2022! — New editions extended and updated, now 780+ pages

Hello! Today I have released new digital editions of my PyQt5, PyQt6, PySide2 and PySide6 book Create GUI Applications with …

ListenData 2022-05-06 11:06:00

Only size-1 arrays can be converted to Python scalars

NumPy is one of the most used modules in Python, and it is used in a variety of tasks ranging from creating arrays to mathematical and statistical calculations. NumPy also brings efficiency to Python programming. While using NumPy you may encounter the error TypeError: only size-1 arrays can be converted to Python scalars. It is one of the most frequently appearing errors, and sometimes it becomes a daunting challenge to solve.
Meaning: Only Size 1 Arrays Can Be Converted To Python Scalars Error
This error generally appears when Python expects a single value but you passed an array which consists of multiple values. For example, you want to calculate the exponential of an array, but the function for the exponential was designed for a scalar variable (which means a single value). When you pass a NumPy array to the function, it raises this error. This error handling prevents your code from proceeding further and avoids unexpected output from the (continued...)
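The situation described above can be reproduced with a short sketch. The choice of math.exp as the scalar-only function is an assumption made for illustration, and the exact wording of the error message may differ between NumPy versions:

```python
import math

import numpy as np

values = np.array([1.0, 2.0, 3.0])

# math.exp expects a single number, so a multi-element array is rejected.
try:
    math.exp(values)
    scalar_call_failed = False
except TypeError as err:
    scalar_call_failed = True
    print("TypeError:", err)

# Fix 1: use the NumPy function, which works element by element.
elementwise = np.exp(values)

# Fix 2: wrap the scalar function so it is applied to each element in turn.
vectorized = np.vectorize(math.exp)(values)

print(elementwise)
print(vectorized)
```

The NumPy function is usually the better choice. np.vectorize is a convenience wrapper around a Python-level loop and does not bring the speed of a native array operation.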
Quansight Labs 2022-05-03 00:00:00

The evolution of the SciPy developer CLI

The development story of a developer command-line interface (CLI) for the SciPy project, with examples
Living in an Ivory Basement 2022-04-21 22:00:00

Storing 64-bit unsigned integers in SQLite databases, for fun and profit

Storing unsigned longs in SQLite is possible, and can be fast.
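One possible approach, sketched below, relies on the fact that SQLite's INTEGER type is a signed 64-bit value: unsigned values at or above 2**63 are shifted into the negative range before storage and shifted back after reading. The table name and helper functions are hypothetical, and this is not necessarily the method the post uses:

```python
import sqlite3

MAX_SIGNED = 2**63 - 1


def to_signed(u):
    """Map an unsigned 64-bit integer onto the signed 64-bit range."""
    if not 0 <= u < 2**64:
        raise ValueError("value does not fit in 64 unsigned bits")
    return u - 2**64 if u > MAX_SIGNED else u


def to_unsigned(s):
    """Invert to_signed for values read back from the database."""
    return s + 2**64 if s < 0 else s


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (value INTEGER PRIMARY KEY)")

originals = [0, 42, 2**63, 2**64 - 1]
conn.executemany(
    "INSERT INTO hashes (value) VALUES (?)",
    [(to_signed(u),) for u in originals],
)

restored = sorted(
    to_unsigned(row[0]) for row in conn.execute("SELECT value FROM hashes")
)
conn.close()
print(restored)
```

Equality lookups work unchanged as long as the query value goes through the same conversion. Range queries and ORDER BY see the signed values, so values at or above 2**63 sort before smaller ones inside the database.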