Planet SciPy
Major Price Cuts: Deepnote Versus Cocalc --- Compute Server Pricing
Deepnote is one of CoCalc's direct competitors. Today (November 30, 2023) they announced a major price cut on their pay-as-you-go rates:
"As you may have already heard, starting December 1, we're slashing the pay-as-you-go rates across all our machines – making them more budget-friendly without any hidden terms."
At CoCalc, we recently launched pay-as-you-go machines, which was one of our main development priorities for 2023. These are fully integrated with CoCalc, and were a huge amount of work to bring to market. I was terrified that Deepnote's major price cuts would make Deepnote a much better deal than CoCalc.
Here is how the Deepnote and CoCalc pricing compares:
Machine | Deepnote's New Price | CoCalc Standard | CoCalc Spot |
---|---|---|---|
64GB RAM, 16vCPU | $1.54 | $0.59 | $0.12 |
128GB RAM, 16vCPU (32 CPU on CoCalc) | $2.02 | $1.17 | $0.23 |
K80 GPU (newer L4 GPU on CoCalc) | $2.02 | $0.93 | $0.30 |
Conclusion: CoCalc's prices are still highly competitive, even in light of Deepnote's major price cuts.
Also, spot instances do work very well for many applications.
(continued...)
How to Get Unique Values in a Column in Pandas DataFrame
This tutorial explains how to get unique values from a column in Pandas DataFrame, along with examples.
df['columnName'].unique()
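As a minimal sketch of the call above, assuming pandas is installed: the DataFrame and the column name "city" are hypothetical illustration data, not taken from the tutorial.

```python
import pandas as pd

# Hypothetical example data; the column name "city" is an assumption.
df = pd.DataFrame({"city": ["Paris", "Lyon", "Paris", "Nice", "Lyon"]})

# unique() returns the distinct values in order of first appearance.
unique_cities = df["city"].unique()
print(unique_cities)

# nunique() counts the distinct values (missing values are excluded by default).
n_unique = df["city"].nunique()
print(n_unique)
```

unique() returns an array rather than a Series, so it is often wrapped in list() or sorted() before further use.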
My mentored internship at scikit-learn
Author: Stefanie Senger, François Goupil
Unlocking C-level performance in pandas.DataFrame.apply with Numba
A quick overview of the new Numba engine in DataFrame.apply
Improving the interpolation and signal processing capabilities of CuPy
We are excited to spread the news about the improvements that have been taking place in CuPy, where 18 interpolation and more than 100 signal processing parallel GPU APIs are now available as part of an EOSS4 CZI grant.
Optimization Nuggets: Stochastic Polyak Step-size, Part 2
This blog post discusses the convergence rate of the Stochastic Gradient Descent with Stochastic Polyak Step-size (SGD-SPS) algorithm for minimizing a finite sum objective. Building upon the proof of the previous post, we show that the convergence rate can be improved to O(1/t) under the additional assumption that …
How to Visualize Deep Learning Models
Deep learning models are typically highly complex. While many traditional machine learning models make do with just a couple of hundred parameters, deep learning models have millions or billions of parameters. The large language model GPT-4 that OpenAI released in the spring of 2023 is rumored to have nearly 2 trillion parameters. It goes…
NumPy argmin() Function : Learn with Examples
In this tutorial, we will see how to use the NumPy argmin() function in Python along with examples.
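A short sketch of how argmin() behaves, assuming NumPy is installed. The array values are made up for illustration.

```python
import numpy as np

# Hypothetical 2x3 array used only for illustration.
values = np.array([[4, 2, 9],
                   [7, 1, 3]])

# Without an axis, argmin() returns the index into the flattened array.
flat_index = np.argmin(values)

# unravel_index() converts the flat index back to (row, column) coordinates.
row, col = np.unravel_index(flat_index, values.shape)

# With axis=0, argmin() returns the row index of the minimum in each column.
per_column = np.argmin(values, axis=0)

print(flat_index, (row, col), per_column)
```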
The 'eu' in eucatastrophe – Why SciPy builds for Python 3.12 on Windows are a minor miracle
Moving SciPy to Meson meant finding a different Fortran compiler on Windows, which was particularly tricky to pull off for conda-forge. This blog tells the story about how things looked pretty grim for the Python 3.12 release, and how things ended up working out just in the nick of time.
Adding support for polynomials to Numba
My work was focused on improving NumPy support in Numba, with focus on the polynomial package.
Refining NumPy's Python API for its 2.0 release
A journey through NumPy's Python API from a maintenance perspective.
NVIDIA Is A New Sponsor Of The Scikit-Learn consortium at the Inria Foundation
Author: NVIDIA, François Goupil
NumPy argmax() Function : Learn with Examples
In this tutorial, we will see how to use the NumPy argmax() function in Python along with examples.
The numpy.argmax() function in Python is used to find the indices of the maximum element in an array.
Syntax of NumPy argmax() Function
Below is the syntax of the NumPy argmax() function:
import numpy as np
np.argmax(array, axis, out)
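The three parameters in the syntax above can be sketched as follows, assuming NumPy is installed. The score values are hypothetical.

```python
import numpy as np

# Hypothetical scores: two rows, three candidates per row.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.9]])

# No axis: index of the maximum in the flattened array.
best_overall = np.argmax(scores)

# axis=1: column index of the maximum within each row.
best_per_row = np.argmax(scores, axis=1)

# out: write the result into a preallocated integer array of matching shape.
out = np.empty(2, dtype=np.intp)
np.argmax(scores, axis=1, out=out)

print(best_overall, best_per_row, out)
```

When the maximum occurs more than once, argmax() reports the first occurrence.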
Improving SymPy's Documentation
SymPy's documentation has received many significant improvements over the past two years thanks to funding by the Chan Zuckerberg Initiative.
Doctesting for PyData Libraries
The journey of a PyData Newbie
Integrating Hypothesis into SymPy
Gives an introduction to the utility of hypothesis in SymPy
How to Use Exploratory Notebooks [Best Practices]
Jupyter notebooks have been one of the most controversial tools in the data science community. There are some outspoken critics, as well as passionate fans. Nevertheless, many data scientists will agree that they can be really valuable – if used well. And that’s what we’re going to focus on in this article, which is the…
How to Install PyTorch on Windows
This tutorial explains the steps to install PyTorch on Windows.
PyTorch is a free and open source machine learning library developed by Facebook's AI Research lab. It is built on the Torch library and is mainly used for tasks like computer vision and natural language processing (NLP).
The Array API Standard in SciPy
How can SciPy use the Array API Standard to achieve array library interoperability?
Learnings From Building the ML Platform at Mailchimp
This article was originally an episode of the ML Platform Podcast, a show where Piotr Niedźwiedź and Aurimas Griciūnas, together with ML platform professionals, discuss design choices, best practices, example tool stacks, and real-world learnings from some of the best ML platform professionals. In this episode, Mikiko Bazeley shares her learnings from building the ML…
Optimization Nuggets: Stochastic Polyak Step-size
The stochastic Polyak step-size (SPS) is a practical variant of the Polyak step-size for stochastic optimization. In this blog post, we'll discuss the algorithm and provide a simple analysis for convex objectives with bounded gradients.
Bridging Data Science Tools with PyTorch-Ignite's Code-Generator and Nebari
A summary of my contributions to the Code-Generator project and PyTorch-Ignite ecosystem in the past few months as Quansight Labs intern and my learnings in the process.
Array API Support in scikit-learn
In this blog post, we share how scikit-learn enabled support for the Array API Standard.
scikit-learn 2023 In-person Developer Sprint in Paris, France
Author: Reshama Shaikh, François Goupil
Software Engineering Patterns for Machine Learning
Have you ever talked to your Front-end or Back-end engineer peers and noticed how much they care about code quality? Writing legible, reusable, and efficient code has always been a challenge in the software development community. Endless conversations happen every day across Github pull requests and Slack threads around this topic. How to best adapt…
ML Pipeline Architecture Design Patterns (With 10 Real-World Examples)
There comes a time when every ML practitioner realizes that training a model in Jupyter Notebook is just one small part of the entire project. Getting a workflow ready which takes your data from its raw form to predictions while maintaining responsiveness and flexibility is the real deal. At that point, the Data Scientists or…
How to Run Windscribe VPN in Windows with Python
In this tutorial, we will show you how to run Windscribe VPN in Windows using Python Code. Windscribe is a popular VPN service that offers several features. Windscribe's free version maintains the same speed as the paid plans.
How to Run Proton VPN in Windows with Python
In this tutorial, we will show you how to run Proton VPN in Windows using Python Code.
First you need to download and install the OpenVPN GUI. OpenVPN GUI is a user-friendly application that allows you to easily configure and manage OpenVPN connections on your computer. OpenVPN is a popular open-source VPN protocol that provides secure and encrypted connections over public networks.
Organizing ML Monorepo With Pants
Have you ever copy-pasted chunks of utility code between projects, resulting in multiple versions of the same code living in different repositories? Or, perhaps, you had to make pull requests to tens of projects after the name of the GCP bucket in which you store your data was updated? Situations described above arise way too…
Learnings From Building the ML Platform at Stitch Fix
This article was originally an episode of the ML Platform Podcast, a show where Piotr Niedźwiedź and Aurimas Griciūnas, together with ML platform professionals, discuss design choices, best practices, example tool stacks, and real-world learnings from some of the best ML platform professionals. In this episode, Stefan Krawczyk shares his learnings from building the ML…
Master's in Computer Science 2023.2 at UFPA: NLP and Metaheuristics
We have another selection process open for the Master's in Computer Science at UFPA, with admission this August 2023. This time I am again looking for candidates who want to carry out research in the area of metaheuristics, for whatever combinatorial problems they would like to apply them to. This is still a very vast field and I have…
Deploying Conversational AI Products to Production With Jason Flaks
This article was originally an episode of MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to Jason Flaks about deploying conversational AI products to production. You can watch it on YouTube: Or…
How to Use ChatGPT for Data Science
In this article, we will explore how you, as a data scientist, can use ChatGPT to enhance your data science projects. ChatGPT is a powerful tool that can help you in various aspects of your work, from exploring and analyzing data to generating insights and helping you with coding and troubleshooting. It can also help you to learn data science faster.
PyCon US 2023 - An action-packed week
In this post I'm sharing my experience of traveling to the US for PyCon US 2023
How to Use SHAP Values to Optimize and Debug ML Models
Picture this: you’ve dedicated countless hours to training and fine-tuning your model, meticulously analyzing mountains of data. Yet, you lack a clear understanding of the factors influencing its predictions and, as a result, find it hard to improve it further. If you have ever found yourself in such a situation, trying to make sense of…
MLOps Landscape in 2023: Top Tools and Platforms
As you delve into the landscape of MLOps in 2023, you will find a plethora of tools and platforms that have gained traction and are shaping the way models are developed, deployed, and monitored. To provide you with a comprehensive overview, this article explores the key players in the MLOps and FMOps (or LLMOps) ecosystems,…
Numba Dynamic Exceptions
In the following blogpost, we will explore the newly added feature in Numba: Dynamic exception support. We will discuss the previous limitations and explain how Numba was enhanced to handle runtime exceptions.
How to build ChatGPT Clone in Python
In this article, we will see the steps involved in building a chat application and an answering bot in Python using the ChatGPT API and gradio.
Developing a chat application in Python provides more control and flexibility than the ChatGPT website. You can customize and extend the chat application as per your needs. It also helps you integrate with your existing systems and other APIs.
On the Convergence of the Unadjusted Langevin Algorithm
The Langevin algorithm is a simple and powerful method to sample from a probability distribution. It's a key ingredient of some machine learning methods such as diffusion models and differentially private learning. In this post, I'll derive a simple convergence analysis of this method in the special case when the …
Spyder gets CZI grant to add remote development features, and a new job opening!
During the last few years, Spyder has positioned itself as a popular data science IDE by combining interactive computing and ease of use with robust programming tools. However, limited remote development support compared to some other IDEs has hindered adoption, as many users would like to work with data and code on high performance computing (HPC) clusters or cloud providers like AWS, GCP or DigitalOcean while developing on their personal computers. Adding such features would open up many new research possibilities by enabling the scientific community to tackle data and compute-intensive programming tasks from the ease and efficiency of their local development environments. Thanks to a two-year grant from the Chan Zuckerberg Initiative, we will now be able to address this shortcoming.
Right now, users have two main options to work remotely using a local IDE (aside from a purely web browser-based approach, which is sometimes not available or desirable): They can either edit and execute their files in a terminal, which is not
(continued...)
How to Build ML Model Training Pipeline
Hands up if you’ve ever lost hours untangling messy scripts or felt like you’re hunting a ghost while trying to fix that elusive bug, all while your models are taking forever to train. We’ve all been there, right? But now, picture a different scenario: Clean code. Streamlined workflows. Efficient model training. Too good to be…
Transformers Agent: AI Tool That Automates Everything
We have a new AI tool in the market called Transformers Agent, which is so powerful that it can automate just about any task you can think of. It can generate and edit images, video, audio, answer questions about documents, convert speech to text and do a lot of other things.
Hugging Face, a well-known name in the open-source AI world, released Transformers Agent that provides a natural language API on top of transformers. The API is designed to be easy to use. With a single line of code, it provides a variety of tools for performing natural language tasks, such as question answering, image generation, video generation, text to speech, text classification, and summarization.
What Does GPT-3 Mean For the Future of MLOps? With David Hershey
This article was originally an episode of MLOps Live, an interactive Q&A session where ML practitioners answer questions from other ML practitioners. Every episode is focused on one specific ML topic, and during this one, we talked to David Hershey about GPT-3 and the future of MLOps. You can watch it on YouTube: Or…
Complete Guide to Massively Multilingual Speech (MMS) Model
In this article we have covered everything about the latest multilingual speech model from the basics of how it works to the step-by-step implementation of the model in Python.
Meta, the company that owns Facebook, released a new AI model called Massively Multilingual Speech (MMS) that can convert text to speech and speech to text in over 1,100 languages. It is available for free. It will not only help academicians and researchers across the world but also language preservationists or activists to document and preserve endangered languages to prevent their extinction.
MMS is trained on a large dataset of text and audio in over 1,100 languages. Another strength of the model is that it generates audio which sounds very natural, like human speech. It is also able to identify more than 4,000 spoken languages.
PyQt6 Book now available in Korean: 파이썬과 Qt6로 GUI 애플리케이션 만들기 — The hands-on guide to creating GUI applications with Python gets a new translation
I am very happy to announce that my Python GUI programming book Create GUI Applications with Python & Qt6 / PyQt6 Edition …
AutoGPT : Everything You Need To Know
In this post we have covered AutoGPT in detail. By the end of this tutorial, you will not only understand how it works but also be able to run it on your system. Auto-GPT has gained a significant amount of popularity in the media. It has become one of the most talked-about topics across various social media platforms after ChatGPT. It has captured the attention not only of people in the Artificial Intelligence community but also of people from other backgrounds. Media outlets across countries covered it and reported how it can automate everything ranging from simple to complex tasks.
AutoGPT is an experimental open-source project built on the latest ChatGPT model, i.e., GPT-4. It is not limited to ChatGPT, as it can also search the web and try to find information on the internet. When a client gives us a project with instructions on what to do, we, as analysts, perform tasks to fulfill the project requirements.
Open Source GPT-4 Models Made Easy
In this post we will explain how Open Source GPT-4 Models work and how you can use them as an alternative to a commercial OpenAI GPT-4 solution. Every day new open source large language models (LLMs) are emerging and the list gets bigger and bigger. We will cover two models: the GPT-4 version of Alpaca, and Vicuna. This tutorial covers how the models work as well as their implementation in Python.
Vicuna was the first publicly available open-source model whose output is comparable to GPT-4. It was fine-tuned from Meta's LLaMA 13B model on a conversations dataset collected from ShareGPT. ShareGPT is a website where people share their ChatGPT conversations with others.
Important Note : The Vicuna Model was primarily trained on the GPT-3.5 dataset because most of the conversations on ShareGPT during the model's development were based on GPT-3.5. But the model was evaluated based on
snakemake for doing bioinformatics - inputs and outputs and more!
Slithering your way into bioinformatics with snakemake - inputs and outputs and more!
15 Free Open Source ChatGPT Alternatives (with Code)
In this article we will explain how Open Source ChatGPT alternatives work and how you can use them to build your own ChatGPT clone for free. By the end of this article you will have a good understanding of these models and will be able to compare and use them.
There are various benefits of using open source large language models which are alternatives to ChatGPT. Some of them are listed below.
- Data Privacy: Many companies want to have control over data. It is important for them as they don't want any third party to have access to their data.
- Customization: It allows developers to train large language models with their own data and to apply filtering on certain topics if they wish.
- Affordability: Open source GPT models let you train sophisticated large language models without worrying about expensive hardware.
- Democratizing AI: It opens room for further research which can be used for solving real-world problems.
Getting Started With Git and GitHub in Your Python Projects — Version-Controlling Your Python Projects With Git and GitHub
Using a version control system (VCS) is crucial for any software development project. These systems allow developers to track changes …
Complete Guide to Visual ChatGPT
In this post, we will talk about how to run Visual ChatGPT in Python with Google Colab. ChatGPT has garnered huge popularity recently due to its human-like responses. As of now, it only provides responses in text format, which means it cannot process, generate or edit images. Microsoft recently released a solution that handles images. Now you can ask ChatGPT to generate or edit an image for you.
In the image below, you can see the final output of Visual ChatGPT and what it looks like.
Working With Classes in Python — Understanding the Intricacies of Python Classes
Python supports object-oriented programming (OOP) through classes, which allow you to bundle data and behavior in a single entity. Python …
snakemake for doing bioinformatics - using wildcards to generalize your rules
Slithering your way into bioinformatics with snakemake, wildcard version
Quansight Labs Annual Report 2022: Celebrating Growth and Sustainability in Open Source
Presenting our first annual report! Read about our project achievements, community initiatives, and work culture.
conda & mamba on shared clusters works better now!
conda is great!
A brief overview of automation and parallelization options in UNIX/on an HPC
Automating things! Parallelizing them!
snakemake for doing bioinformatics - a beginner's guide (part 2)
Slithering your way into bioinformatics with snakemake, round 2.
snakemake for doing bioinformatics - a beginner's guide (part 1)
Slithering your way into bioinformatics with snakemake
Python packaging & workflows - where to next?
Potential solutions for pain points when dealing with native code; what needs unifying in the Python packaging space, and how should that be approached?
sourmash has a plugin interface!
Enabling plugins in sourmash, for less directed & more incoherent progress!
Reading "Orwell's Roses" by Rebecca Solnit
This is a good book!
Human Obsolescence in the Telenovela
I spent the day at work playing with ChatGPT, the conversational artificial intelligence. We had surreal and outlandish dialogues: I asked it what Latin America would be like had it been colonized by England, and also what the relationship is between The Lord of the Rings and Game of Thrones. In another, I asked it to write a fictional dialogue between…
Sangho's Internship at Quansight with PyTorch-Ignite project
Blogpost of working on the PyTorch-Ignite project during internship at Quansight
ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5
ChatGPT has been trending on social media platforms. It crossed one million users in just a week. For those who haven't heard of ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chatbot which answers your questions, and the responses it provides may sound human-like. It's an impressive machine learning solution. With the release of GPT-4 we can rely on it over Google search for learning on any topic.
Update: I updated this article with reviews on GPT-4. You can't trust ChatGPT-3.5 for preparation on any certification or exam. It's a Big NO if you think you can refer to ChatGPT-3.5 for answering questions in a telephonic interview round. Yes, I know it's cheating even to use Google for the same, but I wanted to give a WARNING as many people do this and many social media influencers posted on how to leverage ChatGPT-3.5 for cracking
(continued...)
Conda on Colaboratory
Surbhi Sharma shares her exciting experience working as an intern at Quansight Labs and contributing to condacolab, a tool that lets you deploy a Miniconda installation easily on Google Colab notebooks. This enables you to use conda or mamba to install new packages on any Colab session.
Improvements to the Spyder IDE installation experience
Juan Sebastian Bautista, C.A.M. Gerlach and Carlos Cordoba also contributed to this post.
Spyder 5.4.0 was released recently, featuring some major enhancements to its Windows and macOS standalone installers. You'll now get more detailed feedback when new versions are available, and you can download and start the update to them from right within Spyder, instead of having to install them manually. In this post, we'll go over how these new update features work and how you can start using them!
Before proceeding, we want to acknowledge that this work was made possible by a Small Development Grant awarded to Spyder by NumFOCUS, which has enabled us to hire a new developer (Juan Sebastian Bautista Rojas) to be in charge of all the implementation details.
Before these improvements, Spyder already had a mechanism to detect more recent versions, but that functionality was very simple. There was a pop-up dialog warning that a new version was available, but users had to
(continued...)
Interview with Meekail Zain, scikit-learn Team Member
Author: Reshama Shaikh, Meekail Zain
Zoom zoom zoom! Improving Accessibility in JupyterLab
Kulsoom Zahra learns about accessibility and fixes a part of the JupyterLab interface (that used to break when zoomed in) during her summer 2022 internship at Quansight Labs.
Introducing the Spyder-Watchlist plugin
Spyder's Variable Explorer is a great tool which aids the development and debugging of Python code by displaying all variables from the current scope. One thing the Variable Explorer is missing is the ability to display the value of arbitrary, user-definable expressions while debugging. For example, it might be useful to see the value of a specific attribute of an object, or the value of an array at some index. Such a feature is known as a "watchlist" or "watches" in other Integrated Development Environments (IDEs). This blog post introduces the Watchlist plugin developed for Spyder.
Features
The watchlist consists of a user-definable list of expressions.
They are evaluated after each debugger step, and the result of the evaluation is displayed as a string.
This means that value = str(eval(expression)) is performed behind the scenes, and the result is shown in the plugin.
The watchlist is a very powerful tool, but this comes at a cost: Any side effect of an expression will affect the execution environment.
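The evaluation step and its side-effect caveat can be sketched in plain Python. This is only an illustration of the str(eval(expression)) idea, not the plugin's actual code: the function name evaluate_watchlist, the error handling, and the sample scope are all assumptions.

```python
def evaluate_watchlist(expressions, namespace):
    """Evaluate each expression in `namespace` and return its string form."""
    results = {}
    for expression in expressions:
        try:
            results[expression] = str(eval(expression, namespace))
        except Exception as error:
            # Assumption: report the error as text instead of interrupting debugging.
            results[expression] = f"<{type(error).__name__}: {error}>"
    return results


# Hypothetical debugging scope with a single variable.
scope = {"data": [3, 1, 2]}

watched = evaluate_watchlist(["data[0]", "len(data)", "missing"], scope)
print(watched)

# Side effect: pop() removes an element every time the watch is refreshed.
evaluate_watchlist(["data.pop()"], scope)
print(scope["data"])
```

Because every refresh re-runs the expressions, watches are safest when they only read state, such as attribute lookups or indexing.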
Expressions can be
(continued...)
Why Did We Abandon Blogs?
Twitter's writing interface. These days we are watching Elon Musk destroy Twitter. The expectation is that, in this dynamic, the social network will lose users and relevance over time, if it does not blow up all at once, since its new owner is even talking about bankruptcy. It is not the first time that a…
Making pygments accessible
accessible-pygments hosts curated WCAG-compliant themes for all your syntax highlighting needs.
The new Spyder Editor documentation under the spotlights!
In this blogpost, I share my experience as a Google Season of Docs 2022 technical writer working on updating the Editor user documentation.
Close Encounter with pandas and the Jedis of open source
Learning from awesome mentors and contributing to pandas open source
Quansight Labs awarded three CZI EOSS Cycle 5 Grants
We are delighted to share details about new grants to support the sustainability of SciPy, conda-forge, and CuPy
Pandas DataFrame Output for sklearn Transformers
Author: Sangam SwadiK
Developing a Typer CLI for Nebari
The Nebari CLI consists of various commands the user needs to run to initialize, deploy, configure, and update Nebari.
The Russian Roulette: An Unbiased Estimator of the Limit
The idea for what was later called Monte Carlo method occurred to me when I was playing solitaire during my illness.
Stanislaw Ulam, Adventures of a Mathematician
The Russian Roulette offers a simple way to construct an unbiased estimator for the limit of a sequence. It allows for example to …
scikit-learn and Hugging Face join forces
Author: Lysandre Debut, François Goupil
scikit-learn Sprint in Salta, Argentina
Author: Juan Martín Loyola
Getting started with VS Code for Python — Setting up a Development Environment for Python programming
Setting up a working development environment is the first step for any project. Your development environment setup will determine how …
So! You want to search all the public metagenomes with a genome sequence!
Searching all the things - faster!
Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search
Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the step-size on gradient descent and quasi-Newton methods. However, these techniques are much less common for Frank-Wolfe-like algorithms. In this blog post I …
Introducing the 2022 Interns Cohort
Quansight Labs is delighted to welcome its second cohort of 6 interns, who will work on a variety of open source projects and tasks
New 2022 roadmap and grant funding
For the last couple of months, the Spyder team has been working on defining a new roadmap and submitting grant proposals to fund more features and improvements. We are pleased to announce our roadmap for the rest of 2022, and that two proposals were funded!
The roadmap
Considering the importance of sharing a clear perspective of where the Spyder project is going and where we will be focusing our efforts over the coming months, the team has created an initial roadmap for the rest of 2022. We prioritized the highlighted features and enhancements based on input from issues, face-to-face and virtual discussions, Stack Overflow, social media and other feedback, to try to best capture the interests of our users and community.
The proposals
To help make our roadmap achievable, we wrote and submitted proposals to several different venues and organizations in the last couple of months. While we have yet to hear back from some of them, two have already been funded!
The first was for the
(continued...)
SciPy 2022 Accessibility Awareness Programs
Announcing the SciPy 2022 Accessibility Awareness Efforts
The Value of Open Source Sprints, the scikit-learn Experience
Author: Reshama Shaikh
Pollution in India : Real-time AQI Data
Air pollution has become a serious problem in recent years across the world. The effects of air pollution are devastating, and they are not limited to humans but extend to animals and plants as well. It also leads to global warming, which is essentially increasing air and ocean temperatures around the world.
Indian cities have been topping the list of polluted cities. To solve the problem of air pollution, the most important first step is to track it in real time, which alerts people to avoid outdoor activities when air pollution is high. This post explains how you can fetch the real-time Air Quality Index (AQI) of Indian cities using Python and R code. It allows both Python and R programmers to pull pollution data.
You can download the dataset which contains static information about Indian states, cities and AQI stations. Variables stored in this dataset will be used further to fetch real-time data.
(continued...)
My Mayavi story: discovering open source communities
The Mayavi Python software, and my personal history: A thread on Python and scipy ecosystems, building open source codebase, and meeting really cool and friendly people
I am writing today as a goodbye to the project: I used to be one of the core contributors and maintainers but have been …
Pointwise mutual information (PMI) in NLP
Natural Language Processing (NLP) has gained wide acceptance recently: many live projects are running, and it is no longer limited to academia. Use cases of NLP can be seen across industries, such as understanding customers' issues, predicting the next word a user is planning to type on the keyboard, automatic text summarization etc. Many researchers across the world have trained NLP models in several human languages like English, Spanish, French, Mandarin etc. so that the benefits of NLP can be seen in every society. In this post we will talk about one of the most useful NLP metrics, called Pointwise mutual information (PMI), which identifies words that go together, along with its implementation in Python and R.
PMI helps us to find related words. In other words, it measures how much more likely two words are to co-occur than we would expect by chance. For example the word "Data Science" has a specific meaning when these
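To make the idea concrete, PMI for a word pair (x, y) is commonly defined as log2( p(x, y) / (p(x) * p(y)) ). The sketch below estimates it from adjacent-word counts. It is an illustration only, not the article's implementation: the toy corpus and the pmi helper are assumptions.

```python
import math
from collections import Counter

# Hypothetical toy corpus, split on whitespace.
tokens = ("data science is fun and data science is useful "
          "but science is hard").split()

word_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))
n_words = len(tokens)
n_bigrams = len(tokens) - 1


def pmi(first, second):
    """PMI of an adjacent word pair, estimated from raw counts."""
    pair_count = bigram_counts[(first, second)]
    if pair_count == 0:
        return float("-inf")  # the pair never co-occurs in this corpus
    p_xy = pair_count / n_bigrams
    p_x = word_counts[first] / n_words
    p_y = word_counts[second] / n_words
    return math.log2(p_xy / (p_x * p_y))


print(pmi("data", "science"))
print(pmi("fun", "science"))
```

A positive score means the pair appears together more often than independent occurrence would predict. On real corpora the counts are usually smoothed or thresholded, since rare pairs produce unstable scores.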
How to import your data into Acoular
Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array which is stored in an HDF5 file. This blog post explains how to convert data available in other formats into this file format. As examples for other file formats we will use both .csv (comma separated text files) and .mat (Matlab files).
Checking for accessibility: thoughts and a checklist!
A non-exhaustive but totally honest checklist for accessibility review
On the Link Between Optimization and Polynomials, Part 5
Six: All of this has happened before.
Baltar: But the question remains, does all of this have to happen again?
Six: This time I bet no.
Baltar: You know, I've never known you to play the optimist. Why the change of heart?
Six: Mathematics. Law of averages. Let a complex …
Announcing ribbity - a hacky project to build Web sites from GitHub issue trackers
Munging GitHub issue trackers for fun!
Interview with Norbert Preining, scikit-learn Team Member
Author: Reshama Shaikh, Norbert Preining
PyQt6, PySide6, PyQt5 and PySide2 Books -- updated for 2022! — New editions extended and updated, now 780+ pages
Hello! Today I have released new digital editions of my PyQt5, PyQt6, PySide2 and PySide6 book Create GUI Applications with …
5 Years, 10 Sprints, A scikit-learn Open Source Journey
Author: Reshama Shaikh
Only size-1 arrays can be converted to Python scalars
NumPy is one of the most used modules in Python, and it is used in a variety of tasks ranging from creating arrays to mathematical and statistical calculations. NumPy also brings efficiency to Python programming. While using NumPy you may encounter this error:
TypeError: only size-1 arrays can be converted to Python scalars
It is a frequently appearing error, and solving it can sometimes be a daunting challenge.
Meaning : Only Size 1 Arrays Can Be Converted To Python Scalars Error
This error generally appears when Python expects a single value but you pass an array which consists of multiple values.
For example : you want to calculate the exponential value of an array, but the function for the exponential value was designed for a scalar variable (which means a single value). When you pass a NumPy array to the function, it will raise this error. This error prevents your code from proceeding further and avoids unexpected output from the (continued...)
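The exponential example described above can be reproduced with a short sketch, assuming NumPy is installed. The choice of math.exp as the scalar-only function is an assumption, and the exact wording of the error message can differ between NumPy versions.

```python
import math

import numpy as np

values = np.array([1.0, 2.0, 3.0])

# math.exp() is designed for a single number, so a multi-element array fails.
raised = False
try:
    math.exp(values)
except TypeError as error:
    raised = True
    print("TypeError:", error)

# Fix 1: use NumPy's own element-wise function.
elementwise = np.exp(values)

# Fix 2: wrap the scalar function so it is applied to each element in turn.
wrapped = np.vectorize(math.exp)(values)

print(elementwise, wrapped)
```

np.exp is usually the better choice, since np.vectorize is a convenience wrapper around a Python-level loop rather than a speed-up.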
The evolution of the SciPy developer CLI
The development story of a developer command-line interface (CLI) for the SciPy project, with examples
The second Common Fund Data Ecosystem hackathon - May 9-13, 2022!
We're running another hackathon!
Storing 64-bit unsigned integers in SQLite databases, for fun and profit
Storing unsigned longs in SQLite is possible, and can be fast.
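One way to do this is sketched below with the standard-library sqlite3 module. It is not necessarily the approach taken in the post: SQLite's INTEGER type is a signed 64-bit value, so the sketch reinterprets each unsigned value as its two's-complement signed counterpart. The helper names, table name and column name are hypothetical.

```python
import sqlite3

MAX_INT64 = 2**63 - 1
TWO_64 = 2**64


def to_signed(value):
    """Map an unsigned 64-bit integer onto SQLite's signed 64-bit range."""
    if not 0 <= value < TWO_64:
        raise ValueError("value does not fit in 64 unsigned bits")
    return value - TWO_64 if value > MAX_INT64 else value


def to_unsigned(value):
    """Invert to_signed() when reading values back."""
    return value + TWO_64 if value < 0 else value


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (hashval INTEGER PRIMARY KEY)")

originals = [0, 42, 2**63, 2**64 - 1]
conn.executemany("INSERT INTO hashes VALUES (?)",
                 [(to_signed(v),) for v in originals])

rows = conn.execute("SELECT hashval FROM hashes")
restored = sorted(to_unsigned(row[0]) for row in rows)
print(restored == originals)
```

Equality lookups work unchanged under this mapping, but ORDER BY and range comparisons in SQL do not follow unsigned order, because values of 2**63 and above are stored as negative numbers.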