Building ML Platform in Retail and eCommerce
Getting machine learning to solve some of the hardest problems in an organization is great. And eCommerce companies have a ton of use cases where ML can help. The problem is, with more ML models and systems in production, you need to set up more infrastructure to reliably manage everything. And because of that, many…
Complete Guide to Massively Multilingual Speech (MMS) Model
In this article we have covered everything about the latest multilingual speech model from the basics of how it works to the step-by-step implementation of the model in Python.
Meta, the company that owns Facebook, released a new AI model called Massively Multilingual Speech (MMS) that can convert text to speech and speech to text in over 1,100 languages. It is available for free. It will not only help academicians and researchers across the world but also language preservationists or activists to document and preserve endangered languages to prevent their extinction.
MMS is trained on a large dataset of text and audio in over 1,100 languages. Another strength of the model is that the audio it generates sounds very natural, like human speech. It is also able to identify more than 4,000 spoken languages.
How to Build ETL Data Pipeline in ML
From data processing to quick insights, robust pipelines are a must for any ML system. Often the Data Team, comprising Data and ML Engineers, needs to build this infrastructure, and this experience can be painful. However, efficient use of ETL pipelines in ML can help make their life much easier. This article explores the importance…
How to Save Trained Model in Python
When working on real-world machine learning (ML) use cases, finding the best algorithm/model is not the end of your responsibilities. It is crucial to save, store, and package these models for their future use and deployment to production. These practices are needed for a number of reasons: To reiterate, while saving and storing ML models…
How to Build an End-To-End ML Pipeline
One of the most prevalent complaints we hear from ML engineers in the community is how costly and error-prone it is to manually go through the ML workflow of building and deploying models. They run scripts manually to preprocess their training data, rerun the deployment scripts, manually tune their models, and spend their working hours…
The Importance of High-Quality Labeled Data
The key to unlocking the power of machine learning (ML) lies in having high-quality labeled data. In this email, we’ll explore the significance of labeled data, its impact on the performance of ML models, and how you can capitalize on this natural resource of the modern age to drive your ...
The post The Importance of High-Quality Labeled Data appeared first on Sparrow Computing.
Predictive Maintenance at General Electric
As you think through the ways machine learning (ML) can be used to accelerate your business, it can be helpful to see how other companies have done it. Today, I want to share an example of how General Electric (GE) harnessed the power of ML to transform a large part ...
The post Predictive Maintenance at General Electric appeared first on Sparrow Computing.
Building and Deploying CV Models: Lessons Learned From Computer Vision Engineer
With over 3 years of experience in designing, building, and deploying computer vision (CV) models, I’ve realized people don’t focus enough on crucial aspects of building and deploying such complex systems. In this blog post, I’ll share my own experiences and the hard-won insights I’ve gained from designing, building, and deploying cutting-edge CV models across…
AutoGPT Explained: Everything You Need To Know
In this post we cover AutoGPT in detail. By the end of this tutorial, you will not only understand how it works but also be able to run it on your system. Auto-GPT has gained significant popularity in the media and has become one of the most talked-about topics on social media platforms after ChatGPT. It has captured the attention not only of people in the artificial intelligence community but also of people from other backgrounds. Media outlets across countries have reported on how it can automate tasks ranging from simple to complex.
AutoGPT is an experimental open-source project built on GPT-4, the latest model behind ChatGPT. It is not limited to ChatGPT, as it can also search the web for information. When a client gives us a project with instructions on what to do, we, as analysts, perform tasks to fulfill the project requirements.
How to Build an Experiment Tracking Tool [Learnings From Engineers Behind Neptune]
As an MLOps engineer on your team, you are often tasked with improving the workflow of your data scientists by adding capabilities to your ML platform or by building standalone tools for them to use. Experiment tracking is one such capability. And since you are reading this article, the data scientists you support have probably…
Open Source GPT-4 Models Made Easy
In this post we explain how open source GPT-4 models work and how you can use them as an alternative to a commercial OpenAI GPT-4 solution. New open source large language models (LLMs) emerge every day, and the list keeps growing. We cover two models: the GPT-4 version of Alpaca, and Vicuna. This tutorial covers how the models work as well as their implementation in Python.
Vicuna was the first publicly available open-source model with output comparable to GPT-4. It was fine-tuned from Meta's LLaMA 13B model on a conversation dataset collected from ShareGPT, a website where people share their ChatGPT conversations with others.
Important Note: The Vicuna model was primarily trained on GPT-3.5 data, because most of the conversations on ShareGPT during the model's development were based on GPT-3.5. But the model was evaluated based on
snakemake for doing bioinformatics - inputs and outputs and more!
Slithering your way into bioinformatics with snakemake - inputs and outputs and more!
ML Model Packaging [The Ultimate Guide]
Have you ever spent weeks or months building a machine learning model, only to later find out that deploying it into a production environment is complicated and time-consuming? Or have you struggled to manage multiple versions of a model and keep track of all the dependencies and configurations required for deployment? If you’re nodding your…
How to Label Data for Machine Learning
Machine learning has revolutionized the world of technology, playing a crucial role in various applications, from self-driving cars and facial recognition systems to language translation and sentiment analysis. The success of machine learning models largely depends on the quality and quantity of data they are trained on. In particular, labeled ...
The post How to Label Data for Machine Learning appeared first on Sparrow Computing.
Real-World MLOps Examples: End-To-End MLOps Pipeline for Visual Search at Brainly
In this second installment of the series “Real-world MLOps Examples,” Paweł Pęczek, Machine Learning Engineer at Brainly, will walk you through the end-to-end Machine Learning Operations (MLOps) process in the Visual Search team at Brainly. And because it takes more than technologies and processes to succeed with MLOps, he will also share details on: Enjoy…
14 Open Source Alternatives to ChatGPT - Build Your Own Clone for Free
In this article we explain how open source ChatGPT alternatives work and how you can run them to build your own ChatGPT clone for free. We introduce fourteen powerful open source alternatives to ChatGPT, such as GPT4All, Dolly 2, Vicuna, and Alpaca GPT-4. We provide Python code for each of these models so you can run them with ease. By the end of this article you will have a good understanding of these models and will be able to compare and use them according to your requirements.
ChatGPT is not open source. It has had two recent popular releases: GPT-3.5 and GPT-4. GPT-4 brings major improvements over GPT-3.5 and produces more accurate responses. You cannot view or modify ChatGPT's source code, as it is not publicly available. Hence there is a need for models that are open source and available for free. By using these open source
Understanding the Data Science Process for Entrepreneurs
As an entrepreneur looking to harness the power of machine learning (ML) in your business, understanding the data science process is crucial. This process can be broken down into three main steps: The goal is to move through these stages as quickly as possible so that you can gather feedback ...
The post Understanding the Data Science Process for Entrepreneurs appeared first on Sparrow Computing.
Deploying Large NLP Models: Infrastructure Cost Optimization
NLP models in commercial applications such as text generation systems have attracted great interest among users. These models have achieved groundbreaking results in many NLP tasks such as question answering, summarization, language translation, classification, paraphrasing, et cetera. Models such as ChatGPT, Gopher (280B), GPT-3 (175B), Jurassic-1 (178B), and Megatron-Turing NLG (530B) are predominantly very…
Building a Machine Learning Platform [Definitive Guide]
Moving across the typical machine learning lifecycle can be a nightmare. From gathering and processing data to building models through experiments, deploying the best ones, and managing them at scale for continuous value in production: it’s a lot. As the number of ML-powered apps and services grows, it gets overwhelming for data scientists and ML engineers…
Managing Dataset Versions in Long-Term ML Projects
A long-term ML project involves developing and sustaining applications or systems that leverage machine learning models, algorithms, and techniques. Because of the life span of these apps and systems, the associated ML models need to be constantly updated, redeployed, and maintained, which means that they require proper dataset version management. An example of a…
How to Build a CI/CD MLOps Pipeline [Case Study]
According to the McKinsey survey, 56% of orgs today are using machine learning in at least one business function. It’s clear that the need for efficient and effective MLOps and CI/CD practices is becoming increasingly vital. This article is a real-life study of building a CI/CD MLOps pipeline. We’ll delve into the MLOps practices and strategies…
Complete Guide to Visual ChatGPT
In this post, we will talk about how to run Visual ChatGPT in Python with Google Colab. ChatGPT has garnered huge popularity recently due to its human-like responses. As of now, it only provides responses in text format, which means it cannot process, generate or edit images. Microsoft recently released a solution that handles images. Now you can ask ChatGPT to generate or edit an image for you.
In the image below, you can see what the final output of Visual ChatGPT looks like.
snakemake for doing bioinformatics - using wildcards to generalize your rules
Slithering your way into bioinformatics with snakemake, wildcard version
Saving Utility Companies Years with Computer Vision
How do utility companies monitor thousands of miles of electrical wire to find small imperfections that threaten the entire system? For the entire history of electrical infrastructure, the only answer has been ‘very slowly.’ Now, Sparrow’s computer vision capabilities, combined with Fast Forward’s thermal imaging system, can accomplish what used ...
The post Saving Utility Companies Years with Computer Vision appeared first on Sparrow Computing.
conda & mamba on shared clusters works better now!
conda is great!
A brief overview of automation and parallelization options in UNIX/on an HPC
Automating things! Parallelizing them!
snakemake for doing bioinformatics - a beginner's guide (part 2)
Slithering your way into bioinformatics with snakemake, round 2.
snakemake for doing bioinformatics - a beginner's guide (part 1)
Slithering your way into bioinformatics with snakemake
sourmash has a plugin interface!
Enabling plugins in sourmash, for less directed & more incoherent progress!
Reading "Orwell's Roses" by Rebecca Solnit
This is a good book!
Human obsolescence in the telenovela
I spent the day at work playing with ChatGPT, the conversational artificial intelligence. We had surreal and outlandish dialogues: I asked it what Latin America would be like had it been colonized by England, and also what the relationship is between The Lord of the Rings and Game of Thrones. In another, I asked it to write a fictional dialogue between…
Overview This post is going to showcase the development of a vehicle speed detector using Sparrow Computing’s open-source libraries and PyTorch Lightning. The exciting news here is that we could make this speed detector for any traffic feed without prior knowledge about the site (no calibration required), or specialized imaging ...
The post Speed Trap appeared first on Sparrow Computing.
ChatGPT-4 Is a Smart Analyst, Unlike GPT-3.5
ChatGPT has been trending on social media platforms. It crossed one million users in just a week. For those who haven't heard of ChatGPT, it's a large language model trained by OpenAI. In simple words, it's a chatbot that answers your questions, and its responses may sound human-like. It's an impressive machine learning solution. With the release of GPT-4 we can rely on it over Google search for learning on any topic. Update: I updated this article with reviews of GPT-4.
You can't trust ChatGPT-3.5 when preparing for a certification or exam. It's a big NO if you think you can rely on ChatGPT-3.5 to answer questions in a telephone interview round. Yes, I know it's cheating even to use Google for this, but I wanted to give a WARNING, as many people do it and many social media influencers have posted on how to leverage ChatGPT-3.5 for cracking
Improvements to the Spyder IDE installation experience
Juan Sebastian Bautista, C.A.M. Gerlach and Carlos Cordoba also contributed to this post.
Spyder 5.4.0 was released recently, featuring some major enhancements to its Windows and macOS standalone installers. You'll now get more detailed feedback when new versions are available, and you can download and start the update to them from right within Spyder, instead of having to install them manually. In this post, we'll go over how these new update features work and how you can start using them!
Before proceeding, we want to acknowledge that this work was made possible by a Small Development Grant awarded to Spyder by NumFOCUS, which has enabled us to hire a new developer (Juan Sebastian Bautista Rojas) to be in charge of all the implementation details.
Before these improvements, Spyder already had a mechanism to detect more recent versions, but that functionality was very simple. There was a pop-up dialog warning that a new version was available, but users had to
Interview with Meekail Zain, scikit-learn Team Member
Author: Reshama Shaikh, Meekail Zain
Introducing the Spyder-Watchlist plugin
Spyder's Variable Explorer is a great tool which aids the development and debugging of Python code by displaying all variables from the current scope. One thing the Variable Explorer is missing is the ability to display the value of arbitrary, user-definable expressions while debugging. For example, it might be useful to see the value of a specific attribute of an object, or the value of an array at some index. Such a feature is known as a "watchlist" or "watches" in other Integrated Development Environments (IDEs). This blog post introduces the Watchlist plugin developed for Spyder.
Features
The watchlist consists of a user-definable list of expressions.
They are evaluated after each debugger step, and the result of the evaluation is displayed as a string.
This means that value = str(eval(expression)) is performed behind the scenes, and the result is shown in the plugin.
The watchlist is a very powerful tool, but this comes at a cost: Any side effect of an expression will affect the execution environment.
Expressions can be
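The evaluation step described above can be sketched in a few lines of plain Python. This is only an illustration of the `str(eval(expression))` idea and of the side-effect caveat, not the plugin's actual code: the function name `evaluate_watchlist`, the `frame_locals` dictionary, and the choice to show errors as text are all assumptions made for this example.

```python
def evaluate_watchlist(expressions, namespace):
    """Return {expression: str(result)} for each watched expression."""
    results = {}
    for expression in expressions:
        try:
            # Same idea as the post's `value = str(eval(expression))`,
            # evaluated against the variables of the current scope.
            results[expression] = str(eval(expression, {}, namespace))
        except Exception as error:
            # Hypothetical choice: display the error instead of raising.
            results[expression] = f"<{type(error).__name__}: {error}>"
    return results


# Stand-in for the local variables of the frame being debugged.
frame_locals = {"data": [3, 1, 2], "point": {"x": 1.5}}
watches = ["data[0]", "point['x'] * 2", "len(data)", "missing_name"]
values = evaluate_watchlist(watches, frame_locals)

# Side effects leak into the debugged program: this "watch" sorts `data`
# in place, so the program's own list is changed afterwards.
evaluate_watchlist(["data.sort()"], frame_locals)
```

After the last call, `frame_locals["data"]` is sorted, which is the kind of side effect the post warns about.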
Why did we abandon blogs?
(Image: Twitter's writing interface) These days we are watching Elon Musk destroy Twitter. The expectation is that, in this dynamic, the social network will lose users and relevance over time, if it does not blow up all at once, since its new owner even talks about bankruptcy. It is not the first time that a…
Pandas DataFrame Output for sklearn Transformers
Author: Sangam SwadiK
The Russian Roulette: An Unbiased Estimator of the Limit
The idea for what was later called Monte Carlo method occurred to me when I was playing solitaire during my illness.
Stanislaw Ulam, Adventures of a Mathematician
The Russian Roulette offers a simple way to construct an unbiased estimator for the limit of a sequence. It allows for example to …
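As a rough illustration of the idea, here is a minimal sketch of one common form of the Russian roulette estimator, which may differ from the construction in the post. It writes the limit as a telescoping sum of increments, stops at a random index, and divides each increment by the probability of reaching it. The example sequence a_n = 1 - 2^(-n), the continuation probability 0.7, and the function names are assumptions chosen for this example.

```python
import random


def russian_roulette_estimate(term, continue_prob, rng):
    """One sample of sum_{n>=1} term(n), truncated at a random index."""
    total = 0.0
    survival = 1.0  # P(N >= n): probability that term n is reached
    n = 1
    while True:
        # Dividing by the survival probability keeps the expectation
        # equal to the full (untruncated) sum.
        total += term(n) / survival
        if rng.random() >= continue_prob:
            return total
        survival *= continue_prob
        n += 1


def increment(n):
    # For a_n = 1 - 2**-n (with a_0 = 0), a_n - a_{n-1} = 2**-n,
    # and the limit of a_n is 1.
    return 2.0 ** -n


rng = random.Random(0)
samples = [russian_roulette_estimate(increment, 0.7, rng) for _ in range(20000)]
mean_estimate = sum(samples) / len(samples)
```

Each sample evaluates only finitely many terms, yet the average over many samples approaches the limit 1. The continuation probability trades cost against variance: here the reweighted terms shrink geometrically, so the variance stays finite.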
scikit-learn and Hugging Face join forces
Author: Lysandre Debut, François Goupil
scikit-learn Sprint in Salta, Argentina
Author: Juan Martín Loyola
So! You want to search all the public metagenomes with a genome sequence!
Searching all the things - faster!
Notes on the Frank-Wolfe Algorithm, Part III: backtracking line-search
Backtracking step-size strategies (also known as adaptive step-size or approximate line-search) that set the step-size based on a sufficient decrease condition are the standard way to set the step-size on gradient descent and quasi-Newton methods. However, these techniques are much less common for Frank-Wolfe-like algorithms. In this blog post I …
New 2022 roadmap and grant funding
For the last couple of months, the Spyder team has been working on defining a new roadmap and submitting grant proposals to fund more features and improvements. We are pleased to announce our roadmap for the rest of 2022, and that two proposals were funded!
The roadmap
Considering the importance of sharing a clear perspective of where the Spyder project is going and where we will be focusing our efforts over the coming months, the team has created an initial roadmap for the rest of 2022. We prioritized the highlighted features and enhancements based on input from issues, face-to-face and virtual discussions, Stack Overflow, social media and other feedback, to try to best capture the interests of our users and community.
The proposals
To help make our roadmap achievable, we wrote and submitted proposals to several different venues and organizations in the last couple of months. While we have yet to hear back from some of them, two have already been funded!
The first was for the
The Value of Open Source Sprints, the scikit-learn Experience
Author: Reshama Shaikh
Pollution in India : Real-time AQI Data
Air pollution has become a serious problem across the world in recent years. Its effects are devastating and are not limited to humans but extend to animals and plants as well. It also leads to global warming, which is essentially the increase in air and ocean temperatures around the world.
Indian cities have been topping the list of polluted cities. To tackle air pollution, the first step is to track it in real time, so that people can be alerted to avoid outdoor activities when pollution is high. This post explains how you can fetch the real-time Air Quality Index (AQI) of Indian cities using Python and R code, so both Python and R programmers can pull pollution data.
You can download the dataset which contains static information about Indian states, cities and AQI stations. Variables stored in this dataset will be used further to fetch real-time data.
My Mayavi story: discovering open source communities
The Mayavi Python software, and my personal history: A thread on Python and scipy ecosystems, building open source codebase, and meeting really cool and friendly people
I am writing today as a goodbye to the project: I used to be one of the core contributors and maintainers but have been …
Pointwise mutual information (PMI) in NLP
Natural Language Processing (NLP) has gained wide acceptance recently: many live projects now use it, and it is no longer limited to academia. Use cases of NLP can be seen across industries, such as understanding customers' issues, predicting the next word a user is about to type on the keyboard, and automatic text summarization. Researchers across the world have trained NLP models in many human languages, such as English, Spanish, French and Mandarin, so that the benefits of NLP can reach every society. In this post we talk about one of the most useful NLP metrics, pointwise mutual information (PMI), which identifies words that go together, along with its implementation in Python and R.
PMI helps us find related words. In other words, it measures how much more likely two words are to co-occur than we would expect by chance. For example the phrase "Data Science" has a specific meaning when these
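To make the definition concrete, PMI for a word pair is log2( p(x, y) / (p(x) p(y)) ). The sketch below computes it for adjacent word pairs using only the standard library. The toy corpus, the restriction to adjacent pairs, and the function name `pmi_scores` are assumptions made for this illustration, not the post's implementation.

```python
import math
from collections import Counter


def pmi_scores(sentences):
    """Compute PMI for adjacent word pairs in tokenized sentences."""
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        pair_counts.update(zip(tokens, tokens[1:]))  # adjacent pairs only
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (first, second), count in pair_counts.items():
        p_pair = count / total_pairs
        p_first = word_counts[first] / total_words
        p_second = word_counts[second] / total_words
        # Positive PMI: the pair co-occurs more often than chance predicts.
        scores[(first, second)] = math.log2(p_pair / (p_first * p_second))
    return scores


# Tiny made-up corpus, just to exercise the function.
corpus = [
    "data science is fun".split(),
    "data science is hard".split(),
    "science is fun".split(),
    "big data is big".split(),
]
scores = pmi_scores(corpus)
```

In this toy corpus the pair ("data", "science") scores higher than ("data", "is"), reflecting that "data" and "science" appear together more often than their individual frequencies would suggest.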
How to import your data into Acoular
Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array which is stored in an HDF5 file. This blog post explains how to convert data available in other formats into this file format. As examples for other file formats we will use both .csv (comma separated text files) and .mat (Matlab files).
Checking for accessibility: thoughts and a checklist!
On the Link Between Optimization and Polynomials, Part 5
Six: All of this has happened before.
Baltar: But the question remains, does all of this have to happen again?
Six: This time I bet no.
Baltar: You know, I've never known you to play the optimist. Why the change of heart?
Six: Mathematics. Law of averages. Let a complex …
Announcing ribbity - a hacky project to build Web sites from GitHub issue trackers
Munging GitHub issue trackers for fun!
Interview with Norbert Preining, scikit-learn Team Member
Author: Reshama Shaikh, Norbert Preining
5 Years, 10 Sprints, A scikit-learn Open Source Journey
Author: Reshama Shaikh
Only size-1 arrays can be converted to Python scalars
NumPy is one of the most used modules in Python, and it is used for a variety of tasks ranging from creating arrays to mathematical and statistical calculations. NumPy also brings efficiency to Python programming. While using NumPy you may encounter this error:
TypeError: only size-1 arrays can be converted to Python scalars
It is a frequently occurring error, and solving it can sometimes be a daunting challenge.
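A minimal way to reproduce the error above, and two common ways around it, is sketched below. The array contents are made up for illustration, and the exact wording of the message can vary between NumPy versions, so the sketch only relies on the exception type.

```python
import numpy as np

values = np.array([1.5, 2.5, 3.5])

try:
    # float() expects a single value, but `values` holds three elements.
    float(values)
    error_raised = False
except TypeError:
    error_raised = True

# Fix 1: pick out a single element before converting it.
single = float(values[0])

# Fix 2: convert the whole array element-wise with a NumPy method.
as_ints = values.astype(int)
```

The same error tends to appear whenever a function that expects a Python scalar, such as `float()` or `int()`, is handed an array with more than one element.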
Interview with Lucy Liu, scikit-learn Team Member
Author: Reshama Shaikh, Lucy Liu
The second Common Fund Data Ecosystem hackathon - May 9-13, 2022!
We're running another hackathon!
Storing 64-bit unsigned integers in SQLite databases, for fun and profit
Storing unsigned longs in SQLite is possible, and can be fast.
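SQLite's INTEGER type holds signed 64-bit values, so unsigned 64-bit numbers above 2**63 - 1 do not fit directly. One possible workaround, sketched here and not necessarily the approach taken in the post, is to shift every value by 2**63 before storing it and shift it back on the way out. The table name `hashes`, the column name `hashval`, and the helper functions are hypothetical.

```python
import sqlite3

OFFSET = 1 << 63  # shifts the range [0, 2**64) onto [-2**63, 2**63)


def to_signed(value):
    """Map an unsigned 64-bit integer onto SQLite's signed range."""
    if not 0 <= value < (1 << 64):
        raise ValueError("expected an unsigned 64-bit integer")
    return value - OFFSET


def to_unsigned(stored):
    """Undo to_signed() for a value read back from the database."""
    return stored + OFFSET


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (hashval INTEGER NOT NULL)")

originals = [0, 42, (1 << 63) + 5, (1 << 64) - 1]
conn.executemany(
    "INSERT INTO hashes (hashval) VALUES (?)",
    [(to_signed(value),) for value in originals],
)

# The constant shift preserves ordering, so ORDER BY still sorts correctly.
rows = conn.execute("SELECT hashval FROM hashes ORDER BY hashval").fetchall()
recovered = [to_unsigned(row[0]) for row in rows]
conn.close()
```

Because the shift is a constant, comparisons and range queries on the stored column behave the same as they would on the original unsigned values.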
Interview with Maren Westermann: Extending the Impact of the scikit-learn Sprints to the Community
Author: Reshama Shaikh, Maren Westermann
Behind the Scenes of Data Umbrella scikit-learn Open Source Sprints
Author: Reshama Shaikh, Angela Okune
The First Common Fund Data Ecosystem Hackathon
We ran a successful pilot hackathon, and we will run a second one soon!
Master's in Computer Science 2022: Metaheuristics
We still have some open places for the Master's in Computer Science at UFPA, Belém. Those interested should see the submission instructions on the program's selection page. Since joining the program I have been supervising students in various research projects on computational intelligence applied to smart grid problems. We have already had work on multi-agent systems…
DiffCast: Hands-free Python Screencast Creator — Create reproducible programming screencasts without typos or edits
Programming screencasts are a popular way to teach programming and demo tools. Typically people will open up their favorite editor and record themselves tapping away. But this has a few problems. A good setup for coding isn't necessarily a good setup for video -- with text too small, a window too …
On minimum metagenome covers, and calculating them for your own data.
You, too, can run our software!
Optimization Nuggets: Implicit Bias of Gradient-based Methods
When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post we'll characterize the implicit bias of gradient-based methods on a class of regression problems that includes linear least squares and Huber …
Optimization Nuggets: Exponential Convergence of SGD
This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). For this first post in the series I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.
A bioinformatics training career panel in the DIB Lab
Careers in training!
Hiring an engineer and post-doc to simplify data science on dirty data
Join us to work on reinventing data-science practices and tools to produce robust analysis with less data curation.
It is well known that data cleaning and preparation are a heavy burden to the data scientist.
In the dirty data project, we have been conducting machine-learning research …
TorchVision Datasets: Getting Started
The TorchVision datasets subpackage is a convenient utility for accessing well-known public image and video datasets. You can use these tools to start training new computer vision models very quickly. TorchVision Datasets Example To get started, all you have to do is import one of the Dataset classes. Then, instantiate ...
The post TorchVision Datasets: Getting Started appeared first on Sparrow Computing.
NumPy Any: Understanding np.any()
The np.any() function tests whether any element in a NumPy array evaluates to true: The input can have any shape and the data type does not have to be boolean (as long as it’s truthy). If none of the elements evaluate to true, the function returns false: Passing in a ...
The post NumPy Any: Understanding np.any() appeared first on Sparrow Computing.
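The behavior described in the np.any() excerpt can be checked with a short sketch. The arrays are made up for illustration, and the `axis` example is an addition that goes beyond what the visible excerpt states, although `axis` is a standard parameter of np.any().

```python
import numpy as np

grid = np.array([[0, 0], [0, 3]])

# Any nonzero number counts as truthy, so a single 3 is enough.
has_true = np.any(grid)

# With no truthy elements at all, the result is False.
all_false = np.any(np.zeros((2, 2)))

# With axis=0 the test runs down each column separately.
per_column = np.any(grid, axis=0)
```

Here `has_true` is true, `all_false` is false, and `per_column` reports one result per column of `grid`.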
PyTorch DataLoader Quick Start
PyTorch comes with powerful data loading capabilities out of the box. But with great power comes great responsibility and that makes data loading in PyTorch a fairly advanced topic. One of the best ways to learn advanced topics is to start with the happy path. Then add complexity when you ...
The post PyTorch DataLoader Quick Start appeared first on Sparrow Computing.
How the NumPy append operation works
Understanding the np.append() operation and when you might want to use it.
The post How the NumPy append operation works appeared first on Sparrow Computing.
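As a quick, hedged illustration of np.append(), the sketch below shows its two main behaviors: it returns a new array rather than modifying its input, and it flattens its arguments unless an axis is given. The arrays are made up for this example and are not taken from the post.

```python
import numpy as np

original = np.array([1, 2, 3])

# np.append returns a new array; `original` itself is left unchanged.
extended = np.append(original, [4, 5])

matrix = np.array([[1, 2], [3, 4]])

# Without an axis, both inputs are flattened before being joined.
flattened = np.append(matrix, [[5, 6]])

# With axis=0, the new row is stacked underneath the existing rows.
stacked = np.append(matrix, [[5, 6]], axis=0)
```

Because every call copies the data into a new array, appending inside a long loop can be slow; collecting items in a Python list and converting once at the end is a common alternative.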
Hiring someone to develop scikit-learn community and industry partners
With the growth of scikit-learn and the wider PyData ecosystem, we want to recruit in the Inria scikit-learn team for a new role. Departing from our usual focus on excellence in algorithms, statistics, or code, we want to add to the team someone with some technical understanding, but an …
Using snakemake to do simple wildcard operations on many, many, many files
snakemake is awesome
A paper on the Lees-Edwards method
A few years ago, Sebastian contacted me to help with simulations. Great, I like simulation studies, so we started discussing the details. The idea: use an established method, the Lees-Edwards boundary condition, to study colloids under shear.
A biotech career panel in the DIB Lab
Careers outside of universities!
Scaling sourmash to millions of samples
Bigger and better!
New sourmash databases are available!
Databases are now available for GTDB!
Writing a column for O Estado do Piauí
O Estado do Piauí is a new newspaper that recently emerged over there. With a greater focus on long, dense reporting, mixing investigative and literary journalism, the project aims to discuss in depth the issues of interest to the state, uncover unique stories from Piauí, give visibility to problematic situations, point to alternatives, and much more. Do not…
Interview Series on Research at UFPA's PPGCC: Computational Intelligence
The Faculty of Computing and the Graduate Program in Computer Science at UFPA are developing a project with two goals: first, to better publicize what we produce in our research to the public outside the university; second, to better publicize the same thing INTERNALLY, that is, what we develop…
Moving sourmash towards more community engagement - a funding application
CZI EOSS4 application for sourmash support
Searching all public metagenomes with sourmash
Searching all the things!
Is your software ready for the Journal of Open Source Software?
For the unaware reader, the Journal of Open Source Software (JOSS) is an open-access scientific journal founded in 2016 and aimed at publishing scientific software. A JOSS article is itself short, and its publication helps give recognition to the work on the software. I share here my point of view on what makes some software tools more ready to be published in JOSS. I do not comment on size or relevance for research, which are both documented on the JOSS website.
sourmash 4.1.0 released!!
sourmash v4.1.0 is here!
On the Link Between Optimization and Polynomials, Part 4
While the most common accelerated methods like Polyak and Nesterov incorporate a momentum term, a little known fact is that simple gradient descent –no momentum– can achieve the same rate through only a well-chosen sequence of step-sizes. In this post we'll derive this method and through simulations discuss its practical …
NumFOCUS Welcomes Tesco Technology to Corporate Sponsors
NumFOCUS is pleased to announce our new partnership with Tesco Technology. A long-time PyData event sponsor, Tesco Technology joined NumFOCUS as a Silver Corporate Sponsor in December 2020. “We are very excited to formalize our partnership with Tesco Technology,” said Leah Silen, NumFOCUS Executive Director. “Tesco Technology has partnered with NumFOCUS for the past several […]
The post NumFOCUS Welcomes Tesco Technology to Corporate Sponsors appeared first on NumFOCUS.
Job Posting | Communications and Marketing Manager
Job Title: Communications and Marketing Manager Position Overview The primary role of the Communications & Marketing Manager is to manage the NumFOCUS brand by overseeing all outgoing communications between NumFOCUS and our stakeholders. You will serve the project communities by playing a key role in their event marketing management and assist with project promotional and […]
The post Job Posting | Communications and Marketing Manager appeared first on NumFOCUS.
Getting started with Acoular - Part 2
This is the second in a series of three blog posts about the basic use of Acoular. It assumes that you already have read the first post and continues by explaining some more concepts and additional methods. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources.
Getting started with Acoular - Part 3
This is the third and final in a series of three blog posts about the basic use of Acoular. It assumes that you already have read the first two posts and continues by explaining additional concepts to be used with time domain methods. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources. To continue, we do the same set up as in Part 1. However, as we are setting out to do some signal processing in time domain, we define only TimeSamples, MicGeom, RectGrid and SteeringVector objects but no PowerSpectra or BeamformerBase.
import acoular
ts = acoular.TimeSamples( name="three_sources.h5" )
mg = acoular.MicGeom( from_file="array_64.xml" )
rg = acoular.RectGrid( x_min=-0.2, x_max=0.2, y_min=-0.2, y_max=0.2, z=0.3, increment=0.01 )
st = acoular.SteeringVector( grid=rg, mics=mg …
Getting started with Acoular - Part 1
This is the first in a series of three blog posts about the basic use of Acoular. It explains some fundamental concepts and walks through a simple example. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources.
sourmash 4.0 is now available! Low low cost if you buy now!
sourmash v4.0.0 is here!
On the Link Between Optimization and Polynomials, Part 3
I've seen things you people wouldn't believe.
Valleys sculpted by trigonometric functions.
Rates on fire off the shoulder of divergence.
Beams glitter in the dark near the Polyak gate.
All those landscapes will be lost in time, like tears in rain.
Time to halt.
A momentum optimizer *
Introducing PyMC Labs: Saving the World with Bayesian Modeling
After I left Quantopian in 2020, something interesting happened: various companies contacted me inquiring about consulting to help them with their PyMC3 models.
Usually, I don't hear how people are using PyMC3 -- they mostly show up on GitHub or Discourse when something isn't working right. So, hearing about all these …
Using MicroPython and uploading libraries on Raspberry Pi Pico — Using rshell to upload custom code
MicroPython is an implementation of the Python 3 programming language, optimized to run on microcontrollers. It's one of the options available for programming your Raspberry Pi Pico and a nice friendly way to get started with microcontrollers.
MicroPython can be installed easily on your Pico, by following the instructions on the …
sourmash v4.0.0 release candidate 1 is now available for comment!
sourmash v4.0.0 is coming!
Job Posting | Events and Digital Marketing Coordinator
Job Title: Events and Digital Marketing Coordinator
Position Overview: The primary role of the Events and Digital Marketing Coordinator is to support and assist the Events Manager and the Community Communications and Marketing Manager to advance one of NumFOCUS’s primary missions of educating and building the community of users and developers of open source scientific […]
The post Job Posting | Events and Digital Marketing Coordinator appeared first on NumFOCUS.
Transition your Python project to use pyproject.toml and setup.cfg! (An example.)
Updating old Python packages, in this year of the PSF 2021!
Writing a SAM Coupé SCREEN$ Converter in Python — Interrupt optimizing image converter
The SAM Coupé was a British 8-bit home computer that was pitched as a successor to the ZX Spectrum, featuring improved graphics and sound and higher processor speed.
The SAM Coupé's high-color MODE4 could manage 256x192 resolution graphics, with 16 colors from a choice of 128. Each pixel can …
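To make the MODE4 numbers above concrete, here is a rough sketch of the arithmetic: 16 colors need 4 bits per pixel, so two pixels fit in one byte. The helper names `screen_bytes` and `pack_row` are made up for illustration, and the nibble order (left-hand pixel in the high nibble) is an assumption rather than something the excerpt states. Only pixel data is counted, not any palette bytes.

```python
# MODE4 figures from the text: 256x192 pixels, 16 simultaneous colors.
WIDTH, HEIGHT, COLORS = 256, 192, 16

# 16 colors need log2(16) = 4 bits per pixel.
BITS_PER_PIXEL = (COLORS - 1).bit_length()


def screen_bytes(width=WIDTH, height=HEIGHT, bpp=BITS_PER_PIXEL):
    """Size in bytes of the packed pixel data for one screen."""
    return width * height * bpp // 8


def pack_row(indices):
    """Pack palette indices (0-15) two per byte.

    Assumption: the left-hand pixel of each pair goes in the high nibble.
    """
    if len(indices) % 2:
        raise ValueError("row length must be even")
    if any(not 0 <= i < COLORS for i in indices):
        raise ValueError("palette index out of range")
    pairs = zip(indices[0::2], indices[1::2])
    return bytes((left << 4) | right for left, right in pairs)
```

Under these assumptions a full screen of pixel data comes to 24,576 bytes, and each 256-pixel row packs down to 128 bytes.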
A snakemake hack for checkpoints
snakemake checkpoints r awesome
Squeezing Space Invaders onto the BBC micro:bit's 25 pixels — MicroPython retro game in just 25 pixels
How much game can you fit into 25 pixels? Quite a bit it turns out.
This is a mini clone of arcade classic Space Invaders for the BBC micro:bit microcomputer. Using the accelerometer and two buttons for input, you can beat off wave after wave of aliens that advance …