Planet SciPy

ListenData 2022-06-30 14:04:00

Pointwise mutual information (PMI) in NLP

Natural Language Processing (NLP) has gained wide acceptance in recent years: many live projects now rely on it, and it is no longer limited to academia. NLP use cases appear across industries, such as understanding customers' issues, predicting the next word a user is about to type on a keyboard, and automatic text summarization. Researchers around the world have trained NLP models in many human languages, such as English, Spanish, French, and Mandarin, so that the benefits of NLP reach every society. In this post we will talk about one of the most useful NLP metrics, pointwise mutual information (PMI), which identifies words that tend to go together, along with its implementation in Python and R.

What is Pointwise mutual information?

PMI helps us find related words. In other words, it measures how much more likely two words are to co-occur than we would expect by chance. For example, the phrase "Data Science" has a specific meaning when

(continued...)
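As a rough illustration of the definition above, PMI for a word pair can be estimated from counts as log2(p(x, y) / (p(x) p(y))). This is only a sketch, not the post's implementation: the toy corpus, the choice of adjacent words as the notion of co-occurrence, and the function name `pmi` are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy corpus (hypothetical); co-occurrence here means "adjacent words".
tokens = ("data science is fun and data science is useful "
          "but science is hard and data is big").split()

word_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))

n_words = len(tokens)
n_bigrams = len(tokens) - 1


def pmi(x, y):
    """PMI(x, y) = log2( p(x, y) / (p(x) * p(y)) ), estimated from counts."""
    p_xy = bigram_counts[(x, y)] / n_bigrams
    if p_xy == 0:
        return float("-inf")  # the pair never co-occurs in this corpus
    p_x = word_counts[x] / n_words
    p_y = word_counts[y] / n_words
    return math.log2(p_xy / (p_x * p_y))


print(pmi("data", "science"))  # positive: the pair occurs more often than chance
print(pmi("fun", "data"))      # -inf: never adjacent in this corpus
```

A positive score means the pair appears together more often than independent words would; on a real corpus the counts would come from a much larger text and usually a wider co-occurrence window.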
neptune.ai 2022-06-30 08:52:52

Kedro vs ZenML vs Metaflow: Which Pipeline Orchestration Tool Should You Choose?

In this article, I’m going to compare Kedro, Metaflow, and ZenML, but before that, I think it’s worth taking a few steps back. Why even bother using ML orchestration tools such as these three? It is not that hard to start a Machine Learning project. You install some Python libraries, initiate the model, train it, […]

The post Kedro vs ZenML vs Metaflow: Which Pipeline Orchestration Tool Should You Choose? appeared first on neptune.ai.

Anaconda Blog 2022-06-29 13:12:00

8 Levels of Reproducibility: Future-Proofing Your Python Projects

Anaconda is amplifying the voices of some of its most active and cherished community members in a monthly blog series. If you’re a Maker who has been looking for a chance to tell your story, elaborate on a favorite project, educate your peers, and build your personal brand, consider submitting an abstract. For more details and to access a wealth of educational data science resources and discussion threads, visit Anaconda Nucleus.
neptune.ai 2022-06-28 05:42:19

Real-World MLOps Examples: Model Development in Hypefactors

In this first installment of the series “Real-world MLOps Examples,” Jules Belveze, an MLOps Engineer, will walk you through the model development process at Hypefactors, including the types of models they build, how they design their training pipeline, and other details you may find valuable. Enjoy the chat! Company profile Hypefactors provides an all-in-one media […]

The post Real-World MLOps Examples: Model Development in Hypefactors appeared first on neptune.ai.

Acoular 2022-06-24 05:00:00

How to import your data into Acoular

Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The data is stored in an HDF5 file. This blog post explains how to convert data available in other formats into this file format. As examples of other file formats, we will use both .csv (comma-separated text files) and .mat (Matlab files).
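A minimal sketch of the CSV half of such a conversion might look like the following. It is not taken from the post: the helper names are hypothetical, and the HDF5 layout (a dataset called `time_data` with a `sample_freq` attribute) is an assumption about what Acoular expects that should be checked against the Acoular documentation. Writing the file needs the third-party h5py package.

```python
import io
import numpy as np


def read_csv_samples(source):
    """Load comma-separated text into a (num_samples, num_channels) array."""
    data = np.loadtxt(source, delimiter=",", ndmin=2)
    return data.astype(np.float32)


def write_time_data(path, data, sample_freq):
    """Write samples to HDF5; dataset and attribute names are assumptions."""
    import h5py  # third-party; imported here so the CSV part runs without it

    with h5py.File(path, "w") as f:
        dset = f.create_dataset("time_data", data=data)
        dset.attrs["sample_freq"] = float(sample_freq)


# Two channels, three samples (made-up values).
csv_text = "0.1,0.2\n0.3,0.4\n0.5,0.6\n"
samples = read_csv_samples(io.StringIO(csv_text))
print(samples.shape)
```

Calling `write_time_data("measurement.h5", samples, 51200)` would then produce the file; the file name and the 51200 Hz sampling rate are arbitrary placeholders.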
neptune.ai 2022-06-23 10:45:47

5 Model Deployment Mistakes That Can Cost You a Lot

In Data Science projects, model deployment is probably the most critical and complex part of the whole lifecycle. Operational or mission-critical ML requires thorough design. You have to think about artifacts lineage and tracking, automatic deployments to avoid human errors, testing, and quality checks, feature availability when the model is online… and many more things. […]

The post 5 Model Deployment Mistakes That Can Cost You a Lot appeared first on neptune.ai.

Anaconda Blog 2022-06-22 13:00:00

Anaconda Acquires PythonAnywhere to Increase Python Accessibility and Adoption

At Anaconda, we are always seeking new ways to empower people with data literacy. With that in mind, we built the Anaconda Distribution to include an easy-to-use package and environment manager, packages that contain their cross-language dependencies, installers for all major operating systems and architectures, and a desktop console with direct access to all the tools and artifacts a developer needs for their data science and machine learning projects.
Anaconda Blog 2022-06-16 13:18:00

A Case for R & R: My Women Who Code CONNECT Recharge 2022 Keynote

When the phone battery drains, what do we do? When the laptop gets overheated, what do we do? We are forced to recharge, shut down, or buy a replacement. In this era, the blinking red battery icons on our smartphones, tablets, and laptops send us scrambling for our chargers.
neptune.ai 2022-06-14 08:58:21

MLOps at a Reasonable Scale [The Ultimate Guide]

For a couple of years now, MLOps has probably been the most (over)used term in the ML industry. The more models people want to deploy to production, the more they think about how to organize the Ops part of this process. Naturally, the way to do MLOps has been shaped by the big players on the […]

The post MLOps at a Reasonable Scale [The Ultimate Guide] appeared first on neptune.ai.

Quansight Labs 2022-06-10 16:00:00

Checking for accessibility: thoughts and a checklist!

JupyterLab Accessibility Journey Part 4

Remember how my last post in this series called out accessibility as much more complex than a checklist? True to my sense of humor, this blog post is now a checklist. Irony? I don’t know the meaning of the word.

Okay, okay. But seriously, here's how we got here. When I’m not making my own work, much of my time is spent reviewing other people’s work. Whether it’s design files, code contributions, blog posts, documentation, or who-knows-what-this-week, I often find myself asking questions and giving feedback about accessibility in the review process. This has prompted multiple people to ask me what it is I’m considering when I review for accessibility. Enough people have now asked that I’ve decided to write something down -- and it's turned into a checklist.

neptune.ai 2022-06-08 12:58:28

Imbalanced Data in Object Detection Computer Vision Projects

One of the typical issues faced by data science practitioners is the data imbalance problem. It plagues every other ML project and we all have faced it while working on some classification problem.  There can be several types of data imbalances. For instance, one of the most frequently discussed problems is class imbalance. While collecting […]

The post Imbalanced Data in Object Detection Computer Vision Projects appeared first on neptune.ai.

Anaconda Blog 2022-06-07 21:17:00

The True Value of Community

With more than 25 years of executive-level financial development experience for a wide range of businesses, Angela now provides financial stewardship and executive leadership at Anaconda. Angela is no stranger to crafting financial management plans for technology leaders. In her tenure as the CFO for AirStrip Technologies, a med-tech software company backed by Sequoia Capital, she supported the successful scale of the business and raised more than $100 million in equity and debt financing. Previously, as CFO of Trillion Partners, she spearheaded the effort to raise $60 million in debt and private equity for the business. As VP of Finance at Broadwing, Angela led the raising of more than $200 million and managed the company's M&A activities and investor relations functions.
neptune.ai 2022-06-07 14:45:24

AutoML Solutions: What I Like and Don’t Like About AutoML as a Data Scientist

There’s a sentiment that AutoML could leave a lot of Data Scientists jobless. Will it? Short answer – Nope. In fact, even if AutoML solutions become 10x better, they will not make Machine Learning specialists of any trade irrelevant. Why the optimism, you may ask? Because although a technical marvel, AutoML is no silver bullet. […]

The post AutoML Solutions: What I Like and Don’t Like About AutoML as a Data Scientist appeared first on neptune.ai.

neptune.ai 2022-06-01 14:05:22

Automated Testing in Machine Learning Projects [Best Practices for MLOps]

Automated testing in machine learning is a very useful segment of the ML project which can make some long-term differences. Probably underrated in the early stages of development, it gets attention only in the late stages, when the system starts to break apart with annoying bugs which only grow with time. To ease these issues […]

The post Automated Testing in Machine Learning Projects [Best Practices for MLOps] appeared first on neptune.ai.

neptune.ai 2022-05-27 15:01:50

How to Test a Recommender System

Recommender systems fundamentally address the question – What do people want? Although it is an extensive question, in the context of a consumer application like e-commerce, the answer could be to serve the best products in terms of price and quality for a consumer. For a news aggregator website, it could be to show reliable […]

The post How to Test a Recommender System appeared first on neptune.ai.

fa.bianp.net 2022-05-26 22:00:00

On the Link Between Optimization and Polynomials, Part 5


Six: All of this has happened before.
Baltar: But the question remains, does all of this have to happen again?
Six: This time I bet no.
Baltar: You know, I've never known you to play the optimist. Why the change of heart?
Six: Mathematics. Law of averages. Let a complex …

neptune.ai 2022-05-26 14:05:21

Serving Machine Learning Models With Docker: 5 Mistakes You Should Avoid

As you probably already know, Docker is a tool that allows you to create and deploy isolated environments using containers for running your applications along with their dependencies. While we are at it, let us briefly brush up on some basic concepts around Docker before we move on to the main topic. Why should […]

The post Serving Machine Learning Models With Docker: 5 Mistakes You Should Avoid appeared first on neptune.ai.

neptune.ai 2022-05-25 08:52:36

How to Deploy NLP Models in Production

NLP is currently one of the most exciting areas of ML, as the advent of Transformers and large language models such as GPT and BERT has redefined what is possible within the field. However, much of the focus in blogs and the popular media is on the models themselves and not on highly important practical […]

The post How to Deploy NLP Models in Production appeared first on neptune.ai.

Anaconda Blog 2022-05-24 17:54:00

5 Routes for Going from Zero to Viz in Data Science

About the Author Kathryn Hurchla is a data developer and designer at home shaping human experiences as an Analytics Lead with F Λ N T Λ S Y, a design agency like no other. She has a master's degree in data analytics and visualization and enjoys building end-to-end analytic applications and writing about visual data science. You can find her lost in exploratory data analysis. She contributes to open-source technology communities as a Plotly Dash Ambassador, by leading hands-on learning, and by publishing content independently and with Data Visualization Society’s Nightingale Editorial Committee. Her own enterprise Data Design Dimension may one day be just what her daughters need to make the world as they see it too. Her words are not a reflection of her employer.
scikit-learn Blog 2022-05-22 00:00:00

Interview with Norbert Preining, scikit-learn Team Member

Author: Reshama Shaikh , Norbert Preining
Anaconda Blog 2022-05-18 16:30:00

Asian, American, SVP

When I reflect on my journey as an Asian American woman in corporate America, I can tease out a few things that I feel have contributed to my success. First of all, the aforementioned strong will and confidence have come in handy. I’ve always believed in myself, propelling myself into my career head-and-heart-first and working hard to achieve my goals. I’ve found that if you consistently believe you can get the job done and then execute on that belief, other people become confident in you, too. Plus, I’ve had amazing mentors of all genders and ethnicities who shared their knowledge and modeled success. I can’t overstate how important it is to connect with mentors who can serve as guides and invest in your professional development.
Anaconda Blog 2022-05-06 21:50:00

New Release: Anaconda Distribution Now Supporting M1

2022.05 Anaconda Distribution
ListenData 2022-05-06 11:06:00

Only size-1 arrays can be converted to Python scalars

NumPy is one of the most used modules in Python, and it is used in a variety of tasks ranging from creating arrays to mathematical and statistical calculations. NumPy also brings efficiency to Python programming. While using NumPy you may encounter the error "TypeError: only size-1 arrays can be converted to Python scalars". It is one of the most frequently appearing errors, and solving it can sometimes be a daunting challenge.
Meaning: "Only size-1 arrays can be converted to Python scalars" error
This error generally appears when Python expects a single value but you passed an array that contains multiple values. For example, you want to calculate the exponential of an array, but the exponential function you call was designed for a scalar (a single value). When you pass a NumPy array to that function, it raises this error. The error stops your code from proceeding and so avoids unexpected output (continued...)
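The situation described above can be reproduced in a few lines. This sketch assumes `math.exp` as the scalar-only function and `np.exp` as the vectorized replacement; the array values are made up, and the exact wording of the error message may vary between NumPy versions.

```python
import math
import numpy as np

values = np.array([0.0, 1.0, 2.0])

# math.exp is written for a single number, so a multi-element array fails.
try:
    math.exp(values)
    message = None
except TypeError as exc:
    message = str(exc)
print(message)

# np.exp is vectorized and works element by element.
result = np.exp(values)
print(result)
```

Swapping the scalar function for its NumPy counterpart is usually the simplest fix; `np.vectorize` is another option when no NumPy equivalent exists.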
scikit-learn Blog 2022-05-04 00:00:00

Interview with Lucy Liu, scikit-learn Team Member

Author: Reshama Shaikh , Lucy Liu
Quansight Labs 2022-05-03 02:30:00

The evolution of the SciPy developer CLI

🤔 What is a command-line interface (CLI)?

Imagine a situation where there is a massive system with various tools and functionalities, and every functionality requires a special command or an input from the user. A CLI is designed to tackle such situations. Like a catalog or menu, it lists all the options available, thus helping the user navigate a complex system.
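To make the "menu of options" idea concrete, here is a small sketch built with Python's standard `argparse` module. The tool name `devtool` and its `build` and `test` subcommands are hypothetical and are not SciPy's actual developer CLI.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        prog="devtool", description="Hypothetical developer CLI."
    )
    # Each subcommand is one entry on the "menu".
    subparsers = parser.add_subparsers(dest="command", required=True)

    build = subparsers.add_parser("build", help="compile the project")
    build.add_argument("--parallel", type=int, default=1, help="number of jobs")

    test = subparsers.add_parser("test", help="run the test suite")
    test.add_argument("--verbose", action="store_true", help="show more output")
    return parser


parser = build_parser()
args = parser.parse_args(["build", "--parallel", "4"])
print(args.command, args.parallel)
```

Running such a tool with `--help` would print the list of subcommands and their descriptions, which is the catalog the user navigates.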

Now that we understand what a CLI is, how about we dive into the world of SciPy?

Anaconda Blog 2022-04-30 16:00:00

New from Anaconda: Python in the Browser

PyScript wouldn't be here today without the help of some incredible people.
Anaconda Blog 2022-04-28 13:30:00

How Anaconda Is Rallying to Protect Commercial Users From Cybersecurity Threats

“You have the power, the capacity, and the responsibility to strengthen the cybersecurity and resilience of the critical services and technologies on which Americans rely. We need everyone to do their part to meet one of the defining threats of our time—your vigilance and urgency today can prevent or mitigate attacks tomorrow.”
Living in an Ivory Basement 2022-04-21 22:00:00

Storing 64-bit unsigned integers in SQLite databases, for fun and profit

Storing unsigned longs in SQLite is possible, and can be fast.
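SQLite's INTEGER type is a signed 64-bit value, so an unsigned 64-bit number above 2**63 - 1 cannot be stored directly. One possible workaround, which is not necessarily the approach the post takes, is to shift values by 2**63 on the way in and back on the way out; this keeps the sort order intact. The table name and helper names below are hypothetical.

```python
import sqlite3

OFFSET = 1 << 63  # shifts [0, 2**64) onto SQLite's signed range [-2**63, 2**63)


def to_signed(u):
    """Map an unsigned 64-bit integer to a signed one, preserving order."""
    if not 0 <= u < (1 << 64):
        raise ValueError("value does not fit in 64 unsigned bits")
    return u - OFFSET


def to_unsigned(s):
    """Inverse of to_signed."""
    return s + OFFSET


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hashes (value INTEGER PRIMARY KEY)")

originals = [0, 42, (1 << 63) + 5, (1 << 64) - 1]
conn.executemany(
    "INSERT INTO hashes (value) VALUES (?)",
    [(to_signed(u),) for u in originals],
)

# ORDER BY on the shifted values matches the unsigned ordering.
rows = conn.execute("SELECT value FROM hashes ORDER BY value").fetchall()
restored = [to_unsigned(s) for (s,) in rows]
print(restored == sorted(originals))
conn.close()
```

Because the mapping is monotonic, range queries and indexes on the stored column still behave as they would on the unsigned values.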

Anaconda Blog 2022-04-21 13:10:00

Making Data, Models, and Analytics Awesome

About the Author Mark Skov Madsen, PhD, CFA, is a Lead Trading Analyst at Ørsted. His team develops data, models, and analytics for Ørsted’s Traders end to end. They work in an analytics environment based on Azure DevOps, Kubernetes, JupyterHub and Python.
Quansight Labs 2022-04-10 11:00:00

Why is writing blog posts hard?

We write code. We write issues. We write documentation. We write notes to ourselves, messages to each other, and guidelines to unite teams across projects.

Day in and day out, our remote work and open source lives are driven by written communication. But blog posts are one kind of writing that eludes our regular practice. In our weekly show and tell we got real about "why can writing blog posts be so hard?" and collaboratively wrote up this blog post about what we learned from the discussion.

Quansight Labs 2022-03-31 23:59:02

Making GPUs accessible to the PyData Ecosystem via the Array API Standard.

GPUs have become an essential part of the scientific computing stack and with the advancement in the technology around GPUs and the ease of accessing a GPU in the cloud or on-prem, it is in the best interest of the PyData community to spend time and effort to make GPUs accessible for the users of PyData libraries. A typical user in the PyData ecosystem is quite familiar with the APIs of libraries like SciPy, scikit-learn, and scikit-image -- and at the moment these libraries are largely limited to single-threaded operations on CPU (there are exceptions to that, like linear algebra functions and scikit-learn functionality which uses OpenMP under the hood). In this blog post I will talk about how we can use the Python Array API Standard with the fundamental libraries in the PyData ecosystem along with CuPy for making GPUs accessible to the users of these libraries. With the introduction of that standard by the Consortium for Python Data API Standards and its adoption mechanism in NEP 47 it

(continued...)
scikit-learn Blog 2022-03-21 00:00:00

Behind the Scenes of Data Umbrella scikit-learn Open Source Sprints

Author: Reshama Shaikh , Angela Okune
Living in an Ivory Basement 2022-03-04 23:00:00

The First Common Fund Data Ecosystem Hackathon

We ran a successful pilot hackathon, and we will run a second one soon!

Quansight Labs 2022-02-28 10:00:00

Jupyter accessibility efforts have a roadmap!

Really? Tell me more.

The Chan Zuckerberg Initiative has funded efforts to make the Jupyter ecosystem, starting with JupyterLab, more accessible (as was announced in a prior Jupyter blog post about grants in the ecosystem). You can read the full grant proposal for Jupyter accessibility, the proposal summary, or a GitHub Project list of the grant's milestones to get a sense of the grant's scope.

scikit-learn Blog 2022-02-19 00:00:00

Three Components for Reviewing a Pull Request

Author: Thomas J. Fan
scikit-learn Blog 2022-02-08 00:00:00

Performances and scikit-learn

Author: Julien Jerphanion
scikit-learn Blog 2022-02-07 00:00:00

An Open Source Software Award for scikit-learn

Author: François Goupil
Filipe Saraiva's blog 2022-02-06 14:31:39

Master's in Computer Science 2022: Metaheuristics

We still have some open places for the Master's in Computer Science at UFPA, Belém. Those interested should see the submission instructions on the program's selection page. Since joining the program, I have been supervising students in different research projects on computational intelligence applied to smart grid problems. We have already had work on multi-agent systems…
Martin Fitzpatrick - python 2022-01-26 11:00:00

DiffCast: Hands-free Python Screencast Creator — Create reproducible programming screencasts without typos or edits

Programming screencasts are a popular way to teach programming and demo tools. Typically people will open up their favorite editor and record themselves tapping away. But this has a few problems. A good setup for coding isn't necessarily a good setup for video -- with text too small, a window too …

Quansight Labs 2022-01-19 10:00:00

Conda and Grayskull, the Masters of Software Packaging

Python might be the most popular snake out there, but most of us have also heard of that other serpent: Conda. And some of us have wondered what it really is. In this post we’ll learn about Conda, software packages and package recipes. Most importantly we’ll learn about Grayskull — a conda recipe generator.

Quansight Labs 2022-01-12 13:00:00

IPython 8.0, Lessons learned maintaining software

This is a companion post to the official release of IPython 8.0 that describes what we learned with this large new major IPython release. We hope it will help you apply best practices and have an easier time maintaining your projects or helping others. We'll focus on the many patterns that made it easier for us to make IPython 8.0 what it is with minimal time involved.

fa.bianp.net 2022-01-09 23:00:00

Optimization Nuggets: Implicit Bias of Gradient-based Methods

When an optimization problem has multiple global minima, different algorithms can find different solutions, a phenomenon often referred to as the implicit bias of optimization algorithms. In this post we'll characterize the implicit bias of gradient-based methods on a class of regression problems that includes linear least squares and Huber …
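One well-known instance of this phenomenon is easy to check numerically: on an underdetermined linear least-squares problem, gradient descent started at zero converges to the minimum-norm solution among all exact solutions. The sketch below is an illustration under that assumption, not code from the post; the matrix `A`, the vector `b`, and the iteration count are made up.

```python
import numpy as np

# Underdetermined system: 3 equations, 5 unknowns, so many exact solutions.
A = np.array([
    [1.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
])
b = np.array([2.0, 4.0, 3.0])

# Gradient descent on f(w) = 0.5 * ||A w - b||^2, started at zero.
step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / largest eigenvalue of A^T A
w = np.zeros(A.shape[1])
for _ in range(500):
    w -= step * A.T @ (A @ w - b)

w_min_norm = np.linalg.pinv(A) @ b  # minimum-norm solution via the pseudoinverse
print(np.round(w, 6))
print(np.allclose(w, w_min_norm))
```

Starting from a different initial point would still solve `A w = b` but would generally land on a different solution, which is the sense in which the algorithm, not the problem, selects the answer.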

fa.bianp.net 2021-12-14 23:00:00

Optimization Nuggets: Exponential Convergence of SGD

This is the first of a series of blog posts on short and beautiful proofs in optimization (let me know what you think in the comments!). For this first post in the series I'll show that stochastic gradient descent (SGD) converges exponentially fast to a neighborhood of the solution.

Quansight Labs 2021-12-10 06:00:00

A year of Jupyter community calls

A framing for open source is that the software and code are kernels of community. The code, and its abstractions, unite developers and their patrons; a struggle for growing/evolving open communities is to make sure these groups remain connected. A lot of us showed up for the code, but hung around for the community. We'll continue this post talking about the monthly Jupyter community calls, and how they help all jovyans, Project Jupyter's pet name for their developers and users, stay connected.

Quansight Labs 2021-11-17 10:00:00

A vision for extensibility to GPU & distributed support for SciPy, scikit-learn, scikit-image and beyond

Over the years, array computing in Python has evolved to support distributed arrays, GPU arrays, and other various kinds of arrays that work with specialized hardware, or carry additional metadata, or use different internal memory representations. The foundational library for array computing in the PyData ecosystem is NumPy. But NumPy alone is a CPU-only library - and a single-threaded one at that - and in a world where it's possible to get a GPU or a CPU with a large core count in the cloud cheaply or even for free in a matter of seconds, that may not seem enough. For the past couple of years, a lot of thought and effort has been spent on devising mechanisms to tackle this problem, and evolve the ecosystem in a gradual way towards a state where PyData libraries can run on a GPU, as well as in distributed mode across multiple GPUs.

We feel like a shared vision has emerged, in bits and pieces. In this post, we aim to articulate that vision and

(continued...)
Quansight Labs 2021-11-03 17:23:40

NumPy Benchmarking

In this blog post, I'll be talking about my journey at Quansight. I want to share everything I was involved in and accomplished, the issues I faced, and, most importantly, the awesome life hacks I learned during this period.

First of all, I'd like to express my gratitude to the whole team for allowing me to be a part of such a great team. My work was mainly focused on providing performance benchmarks for NumPy in realistic situations. The target was to show the world that NumPy is efficient in handling quasi real-life situations too.

The primary technical outcome of my work is available in the NumPy documentation.

Gaël Varoquaux - programming 2021-10-28 22:00:00

Hiring an engineer and post-doc to simplify data science on dirty data

Note

Join us to work on reinventing data-science practices and tools to produce robust analysis with less data curation.

It is well known that data cleaning and preparation are a heavy burden to the data scientist.

Dirty data research

In the dirty data project, we have been conducting machine-learning research …

Sparrow Computing 2021-10-22 21:27:39

TorchVision Datasets: Getting Started

The TorchVision datasets subpackage is a convenient utility for accessing well-known public image and video datasets. You can use these tools to start training new computer vision models very quickly. TorchVision Datasets Example To get started, all you have to do is import one of the Dataset classes. Then, instantiate it and access one of ...

The post TorchVision Datasets: Getting Started appeared first on Sparrow Computing.

Sparrow Computing 2021-10-21 14:19:21

NumPy Any: Understanding np.any()

The np.any() function tests whether any element in a NumPy array evaluates to true. The input can have any shape and the data type does not have to be boolean (as long as it’s truthy). If none of the elements evaluate to true, the function returns false. Passing in a value for the axis argument ...
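The behaviour described above can be sketched as follows; the array values are made up for illustration.

```python
import numpy as np

a = np.array([[0, 0, 3],
              [0, 0, 0]])

any_overall = np.any(a)            # is at least one element truthy?
any_per_column = np.any(a, axis=0) # reduce over rows: one result per column
any_per_row = np.any(a, axis=1)    # reduce over columns: one result per row
none_truthy = np.any(np.zeros(4))  # all zeros, so nothing is truthy

print(any_overall, any_per_column, any_per_row, none_truthy)
```

Non-boolean inputs work because zero counts as false and any non-zero value counts as true.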

The post NumPy Any: Understanding np.any() appeared first on Sparrow Computing.

Sparrow Computing 2021-10-07 20:52:03

PyTorch DataLoader Quick Start

PyTorch comes with powerful data loading capabilities out of the box. But with great power comes great responsibility and that makes data loading in PyTorch a fairly advanced topic. One of the best ways to learn advanced topics is to start with the happy path. Then add complexity when you find out you need it. ...

The post PyTorch DataLoader Quick Start appeared first on Sparrow Computing.

Sparrow Computing 2021-10-06 16:53:32

How the NumPy append operation works

Understanding the np.append() operation and when you might want to use it.
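As a quick sketch of how `np.append` behaves (the arrays here are made up): without an `axis` argument both inputs are flattened, with `axis` the shapes must match along the other dimensions, and in both cases a new array is returned rather than the original being modified.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

flat = np.append(a, [5, 6])            # no axis: inputs are flattened first
rows = np.append(a, [[5, 6]], axis=0)  # add a row; other dimensions must match

print(flat)
print(rows)
print(a)  # unchanged: np.append returns a new array
```

Because every call copies the data, appending inside a long loop is slow; collecting values in a Python list and converting once is usually the better choice.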

The post How the NumPy append operation works appeared first on Sparrow Computing.

Gaël Varoquaux - programming 2021-09-13 22:00:00

Hiring someone to develop scikit-learn community and industry partners

Note

With the growth of scikit-learn and the wider PyData ecosystem, we want to recruit in the Inria scikit-learn team for a new role. Departing from our usual focus on excellence in algorithms, statistics, or code, we want to add to the team someone with some technical understanding, but an …

Pierre de Buyl's homepage - scipy 2021-08-24 13:00:00

A paper on the Lees-Edwards method

A few years ago, Sebastian contacted me to help with simulations. Great, I like simulation studies, so we start discussing the details. The idea: use an established method, the Lees-Edwards boundary condition, to study colloids under shear.

Living in an Ivory Basement 2021-07-19 22:00:00

A biotech career panel in the DIB Lab

Careers outside of universities!

Sparrow Computing 2021-07-08 16:09:47

Poetry for Package Management in Machine Learning Projects

When you’re building a production machine learning system, reproducibility is a proxy for the effectiveness of your development process. But without locking all your Python dependencies, your builds are not actually repeatable. If you work in a Python project without locking long enough, you will eventually get a broken build because of a transitive dependency ...

The post Poetry for Package Management in Machine Learning Projects appeared first on Sparrow Computing.

Sparrow Computing 2021-06-29 20:38:29

Development containers in VS Code: a quick start guide

If you’re building production ML systems, dev containers are the killer feature of VS Code. Dev containers give you full VS Code functionality inside a Docker container. This lets you unify your dev and production environments if production is a Docker container. But even if you’re not targeting a Docker deployment, running your code in ...

The post Development containers in VS Code: a quick start guide appeared first on Sparrow Computing.

Living in an Ivory Basement 2021-06-28 22:00:00

New sourmash databases are available!

Databases are now available for GTDB!

Filipe Saraiva's blog 2021-06-25 12:06:45

Writing a Column for O Estado do Piauí

O Estado do Piauí is a new newspaper that recently emerged over there. With a greater focus on long, dense reporting that mixes investigative and literary journalism, the project aims to discuss in depth the topics of interest to the state, uncover unique stories from Piauí, give visibility to problematic situations, point out alternatives, and much more. …
Filipe Saraiva's blog 2021-06-21 21:51:57

Interview Series on Research at UFPA's PPGCC – Computational Intelligence

The Faculty of Computing and the Graduate Program in Computer Science at UFPA are developing a project with two goals: first, to better publicize what we produce in our research to audiences outside the university; second, to better publicize the same thing INTERNALLY, that is, what we develop…
Living in an Ivory Basement 2021-06-07 22:00:00

Searching all public metagenomes with sourmash

Searching all the things!

Pierre de Buyl's homepage - scipy 2021-05-21 13:00:00

Is your software ready for the Journal of Open Source Software?

For the unaware reader, the Journal of Open Source Software (JOSS) is an open-access scientific journal founded in 2016 and aimed at publishing scientific software. A JOSS article in itself is short, and its publication helps the work on the software gain recognition. I share here my point of view on what makes some software tools more ready to be published in JOSS. I do not comment on the size or the relevance for research, which are both documented on JOSS' website.

Living in an Ivory Basement 2021-05-16 22:00:00

sourmash 4.1.0 released!!

sourmash v4.1.0 is here!

Sparrow Computing 2021-05-14 20:11:16

Basic Counting in Python

I love fancy machine learning algorithms as much as anyone. But sometimes, you just need to count things. And Python’s built-in data structures make this really easy. Let’s say we have a list of strings. With a list like this, you might care about a few different counts. What’s the count of all items? What’s ...
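The kinds of counts mentioned above can be sketched with built-ins and the standard library's `collections.Counter`. The list contents are hypothetical, since the excerpt does not show the original data.

```python
from collections import Counter

items = ["cat", "dog", "cat", "bird", "dog", "cat"]  # hypothetical data

total = len(items)        # count of all items
unique = len(set(items))  # count of distinct items
counts = Counter(items)   # count per distinct item

print(total, unique)
print(counts.most_common(1))
```

`Counter` behaves like a dictionary, so `counts["cat"]` gives the count for a single item and missing keys return 0 instead of raising an error.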

The post Basic Counting in Python appeared first on Sparrow Computing.

Sparrow Computing 2021-05-13 18:11:11

How to Use the PyTorch Sigmoid Operation

The PyTorch sigmoid function is an element-wise operation that squishes any real number into a range between 0 and 1. This is a very common activation function to use as the last layer of binary classifiers (including logistic regression) because it lets you treat model predictions like probabilities that their outputs are true, i.e. p(y ...
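The function itself is 1 / (1 + exp(-x)). The sketch below is a plain-Python stand-in that needs no PyTorch installation; in PyTorch the same element-wise operation is available as `torch.sigmoid`. The helper name `sigmoid` and the example inputs are made up for illustration.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # avoids overflow of exp(-x) for large negative x
    return z / (1.0 + z)


logits = [-2.0, 0.0, 3.0]
probs = [sigmoid(v) for v in logits]  # applied element by element
print(probs)
```

An input of 0 maps to 0.5, large positive inputs approach 1, and large negative inputs approach 0, which is why the output can be read as a probability.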

The post How to Use the PyTorch Sigmoid Operation appeared first on Sparrow Computing.

fa.bianp.net 2021-04-12 22:00:00

On the Link Between Optimization and Polynomials, Part 4

While the most common accelerated methods like Polyak and Nesterov incorporate a momentum term, a little known fact is that simple gradient descent –no momentum– can achieve the same rate through only a well-chosen sequence of step-sizes. In this post we'll derive this method and through simulations discuss its practical …

NumFOCUS 2021-04-09 18:02:05

NumFOCUS Welcomes Tesco Technology to Corporate Sponsors

NumFOCUS is pleased to announce our new partnership with Tesco Technology. A long-time PyData event sponsor, Tesco Technology joined NumFOCUS as a Silver Corporate Sponsor in December 2020. “We are very excited to formalize our partnership with Tesco Technology,” said Leah Silen, NumFOCUS Executive Director. “Tesco Technology has partnered with NumFOCUS for the past several […]

The post NumFOCUS Welcomes Tesco Technology to Corporate Sponsors appeared first on NumFOCUS.

NumFOCUS 2021-04-08 21:14:55

Job Posting | Communications and Marketing Manager

Job Title: Communications and Marketing Manager Position Overview The primary role of the Communications & Marketing Manager is to manage the NumFOCUS brand by overseeing all outgoing communications between NumFOCUS and our stakeholders. You will serve the project communities by playing a key role in their event marketing management and assist with project promotional and […]

The post Job Posting | Communications and Marketing Manager appeared first on NumFOCUS.

Acoular 2021-04-01 05:00:00

Getting started with Acoular - Part 2

This is the second in a series of three blog posts about the basic use of Acoular. It assumes that you have already read the first post and continues by explaining some more concepts and additional methods. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources.

Acoular 2021-04-01 05:00:00

Getting started with Acoular - Part 3

This is the third and final in a series of three blog posts about the basic use of Acoular. It assumes that you have already read the first two posts and continues by explaining additional concepts to be used with time domain methods. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources. To continue, we do the same setup as in Part 1. However, as we are setting out to do some signal processing in the time domain, we define only TimeSamples, MicGeom, RectGrid and SteeringVector objects but no PowerSpectra or BeamformerBase.

import acoular
ts = acoular.TimeSamples( name="three_sources.h5" )
mg = acoular.MicGeom( from_file="array_64.xml" )
rg = acoular.RectGrid( x_min=-0.2, x_max=0.2, y_min=-0.2, y_max=0.2, z=0.3, increment=0.01 )
st = acoular.SteeringVector( grid=rg, mics=mg (continued...)

Acoular 2021-04-01 05:00:00

Getting started with Acoular - Part 1

This is the first in a series of three blog posts about the basic use of Acoular. It explains some fundamental concepts and walks through a simple example. Acoular is a Python library that processes multichannel data (up to a few hundred channels) from acoustic measurements with a microphone array. The focus of the processing is on the construction of a map of acoustic sources. This is somewhat similar to taking an acoustic photograph of some sound sources.

Sparrow Computing 2021-03-22 23:54:00

PyTorch Tensor to NumPy Array and Back

You can easily convert a NumPy array to a PyTorch tensor and a PyTorch tensor to a NumPy array. This post explains how it works.

The post PyTorch Tensor to NumPy Array and Back appeared first on Sparrow Computing.

Sparrow Computing 2021-03-20 03:15:00

TorchVision Transforms: Image Preprocessing in PyTorch

TorchVision, a PyTorch computer vision package, has a great API for image pre-processing in its torchvision.transforms module. This post gives some basic usage examples, describes the API and shows you how to create and use custom image transforms.

The post TorchVision Transforms: Image Preprocessing in PyTorch appeared first on Sparrow Computing.

fa.bianp.net 2021-03-01 23:00:00

On the Link Between Optimization and Polynomials, Part 3

I've seen things you people wouldn't believe.
Valleys sculpted by trigonometric functions.
Rates on fire off the shoulder of divergence.
Beams glitter in the dark near the Polyak gate.
All those landscapes will be lost in time, like tears in rain.
Time to halt.

A momentum optimizer *

While My MCMC Gently Samples 2021-02-23 15:00:00

Introducing PyMC Labs: Saving the World with Bayesian Modeling

After I left Quantopian in 2020, something interesting happened: various companies contacted me inquiring about consulting to help them with their PyMC3 models.

Usually, I don't hear how people are using PyMC3 -- they mostly show up on GitHub or Discourse when something isn't working right. So, hearing about all these …

Martin Fitzpatrick - python 2021-02-22 08:00:00

Using MicroPython and uploading libraries on Raspberry Pi Pico — Using rshell to upload custom code

MicroPython is an implementation of the Python 3 programming language, optimized to run on microcontrollers. It's one of the options available for programming your Raspberry Pi Pico and a nice friendly way to get started with microcontrollers.

MicroPython can be installed easily on your Pico, by following the instructions on the …

NumFOCUS 2021-02-10 19:54:10

Job Posting | Events and Digital Marketing Coordinator

Job Title: Events and Digital Marketing Coordinator Position Overview The primary role of the Events and Digital Marketing Coordinator is to support and assist the Events Manager and the Community Communications and Marketing Manager to advance one of NumFOCUS’s primary missions of educating and building the community of users and developers of open source scientific […]

The post Job Posting | Events and Digital Marketing Coordinator appeared first on NumFOCUS.

Living in an Ivory Basement 2021-02-01 23:00:00

Transition your Python project to use pyproject.toml and setup.cfg! (An example.)

Updating old Python packages, in this year of the PSF 2021!

Martin Fitzpatrick - python 2021-01-28 14:00:00

Writing a SAM Coupé SCREEN$ Converter in Python — Interrupt optimizing image converter

The SAM Coupé was a British 8 bit home computer that was pitched as a successor to the ZX Spectrum, featuring improved graphics and sound and higher processor speed.

The SAM Coupé's high-color MODE4 could manage 256x192 resolution graphics, with 16 colors from a choice of 128. Each pixel can …

Living in an Ivory Basement 2021-01-24 23:00:00

A snakemake hack for checkpoints

snakemake checkpoints r awesome

Martin Fitzpatrick - python 2021-01-21 07:00:00

Squeezing Space Invaders onto the BBC micro:bit's 25 pixels — MicroPython retro game in just 25 pixels

How much game can you fit into 25 pixels? Quite a bit it turns out.

This is a mini clone of arcade classic Space Invaders for the BBC micro:bit microcomputer. Using the accelerometer and two buttons for input, you can beat off wave after wave of aliens that advance …

ListenData 2021-01-06 10:35:00

Run SAS in Python without Installation

Introduction
In the past few years Python has gained huge popularity as a programming language in the data science world. Many banks and pharma organisations have started using Python, and some of them are in a transition stage, migrating their SAS syntax library to Python. Many big organisations have been using SAS since the early 2000s and have developed hundreds of SAS programs for tasks ranging from data extraction to model building and validation. Hence it's a marathon task to migrate SAS code to any other programming language. Migration can only be done in phases so that day-to-day tasks are not hit by the development and testing of Python code. Since Python is open source, maintaining the existing code sometimes becomes difficult. Some SAS procedures are very robust and powerful, and their Python alternatives are still not implemented; replicating them might be doable, but not in a straightforward way for the average developer or analyst.

Do you wish

(continued...)

Filipe Saraiva's blog 2020-12-30 12:43:56

House of X/Powers of X (Dinastia X/Potências de X)

No team of heroes is as dear to me as the X-Men. Around the end of the 90s I started collecting and kept at it for a few years, but then came the fateful price increase with the Super-Heróis Premium line, which ended up discouraging me from buying. Since then I have followed them sporadically, reading news about them, buying one issue or another…

ListenData 2020-12-21 14:50:00

Wish Christmas with Python and R

This post is dedicated to all Python and R programming lovers. Flaunt your knowledge in your peer group with the following programs. As a data science professional, you want your wish to be special on Christmas Eve. If you study the code, you may also learn a trick or two that you can use later in your daily tasks.

Method 1 : Run the following program and see what I mean

R Code


paste(intToUtf8(acos(log(1))*180/pi-13),
toupper(substr(month.name[2],2,2)),
paste(rep(intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2),2), collapse = intToUtf8(0)),
LETTERS[5^(3-1)], intToUtf8(atan(1/sqrt(3))*180/pi+2),
toupper(substr(month.abb[10],2,2)),
intToUtf8(acos(log(1))*180/pi-(2*3^2)),
toupper(substr(month.name[4],3,4)),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+3*2+1),
intToUtf8(acos(exp(0)/2)*180/pi+2^4+2*4),
intToUtf8(acos(log(1))*180/pi-13),
LETTERS[median(0:2)],
intToUtf8(atan(1/sqrt(3))*180/pi*3-7),
sep = intToUtf8(0)
)

Python Code


import math
import datetime

(chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ datetime.date(1900, 2, 1).strftime('%B')[1] \
+ 2 * datetime.date(1900, 2, 1).strftime('%B')[3] \
+ datetime.date(1900, 2, 1).strftime('%B')[7] \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi+2)) \
+ datetime.date(1900, 10, 1).strftime('%B')[1] \
+ chr(int(math.acos(math.log(1))*180/math.pi-18)) \
+ datetime.date(1900, 4, 1).strftime('%B')[2:4] \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+3*2+1)) \
+ chr(int(math.acos(math.exp(0)/2)*180/math.pi+2**4+2*4)) \
+ chr(int(math.acos(math.log(1))*180/math.pi-13)) \
+ "{:c}".format(97) \
+ chr(int(math.atan(1/math.sqrt(3))*180/math.pi*3-7))).upper()

Method 2 : Audio Wish for Christmas

Turn on computer speakers before running the code.

R Code



install.packages("audio")
library(audio)
christmas_file <- tempfile()
download.file("https://github.com/deepanshu88/Datasets/raw/master/UploadedFiles/merrychristmas1.wav", christmas_file, mode = "wb")
xmas
(continued...)

fa.bianp.net 2020-12-20 23:00:00

On the Link Between Optimization and Polynomials, Part 2

We can tighten the analysis of gradient descent with momentum through a combination of Chebyshev polynomials of the first and second kind. Following this connection, we'll derive one of the most iconic methods in optimization: Polyak momentum.
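For readers who have not met the method, here is a minimal sketch of Polyak (heavy-ball) momentum on a small diagonal quadratic, compared with plain gradient descent. It uses the textbook parameters for a quadratic with curvature between mu and L, not anything derived in the post itself. The eigenvalues, starting point and helper names are made up for illustration.

```python
import math

def heavy_ball(eigs, x0, n_steps, alpha, beta):
    """Iterate x_next = x - alpha * grad f(x) + beta * (x - x_prev)
    on f(x) = 0.5 * sum(eig_i * x_i**2). beta = 0 gives plain gradient descent."""
    x_prev = list(x0)
    x = list(x0)
    for _ in range(n_steps):
        x_next = [
            xi - alpha * lam * xi + beta * (xi - xpi)
            for lam, xi, xpi in zip(eigs, x, x_prev)
        ]
        x_prev, x = x, x_next
    return x

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

# Hypothetical problem: Hessian eigenvalues between mu = 1 and L = 100.
eigs = [1.0, 25.0, 50.0, 75.0, 100.0]
mu, L = min(eigs), max(eigs)
x0 = [1.0] * len(eigs)
n_steps = 100

# Textbook heavy-ball parameters for a quadratic with spectrum in [mu, L].
alpha = 4.0 / (math.sqrt(L) + math.sqrt(mu)) ** 2
beta = ((math.sqrt(L) - math.sqrt(mu)) / (math.sqrt(L) + math.sqrt(mu))) ** 2

# Distance to the minimizer (the origin) after n_steps of each method.
hb_err = norm(heavy_ball(eigs, x0, n_steps, alpha, beta))
gd_err = norm(heavy_ball(eigs, x0, n_steps, 2.0 / (L + mu), 0.0))
print(hb_err, gd_err)
```

Setting `beta` to zero removes the momentum term, so the same function doubles as the gradient descent baseline.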

ListenData 2020-12-19 15:59:00

How to use variable in a query in pandas

Suppose you want to reference a variable in a query in the pandas package in Python. This seems like a straightforward task, but it sometimes becomes daunting. Let's discuss it with examples in the article below.

Let's create a sample dataframe with 3 columns and 4 rows. This dataframe is used for demonstration purposes.


import pandas as pd
df = pd.DataFrame({"col1" : range(1,5),
"col2" : ['A A','B B','A A','B B'],
"col3" : ['A A','A A','B B','B B']
})

Filter a value A A in column col2

To reference a variable in a query, prefix its name with @.
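A minimal sketch of the @ prefix, rebuilding the sample dataframe so the block runs on its own. The helper `filter_by_col2` and its `value` parameter are hypothetical names chosen for this example; `DataFrame.query` resolves `@value` from the local variables of the calling code.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": range(1, 5),
    "col2": ["A A", "B B", "A A", "B B"],
    "col3": ["A A", "A A", "B B", "B B"],
})

def filter_by_col2(frame, value):
    """Keep the rows whose col2 equals `value` (hypothetical helper)."""
    # "@value" tells query() to look up the Python variable named `value`.
    return frame.query("col2 == @value")

filtered = filter_by_col2(df, "A A")
print(filtered)
```

Without the @, pandas would treat `value` as a column name and fail to find it in the dataframe.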
Mention
(continued...)

NumFOCUS 2020-12-18 21:21:54

NumFOCUS hires Open Source Developer Advocate!

  NumFOCUS is pleased to announce that Arliss Collins has been hired as our organization’s first Open Source Developer Advocate. Founded in 2012, NumFOCUS has finally grown beyond just providing non-technical needs for our 40+ sponsored projects! As our first technical hire, Arliss will work to help understand our projects from a technical perspective and […]

The post NumFOCUS hires Open Source Developer Advocate! appeared first on NumFOCUS.

NumFOCUS 2020-12-11 19:37:25

A Pivotal Time in NumFOCUS’s Project Aimed DEI Efforts

NumFOCUS is pleased to announce the launch of our Contributor Diversification & Retention Research Project funded by a grant from the Gordon and Betty Moore Foundation.  “We were eager to support NumFOCUS’s diversity initiative because it aims to get to the heart of what is preventing greater participation in data science. We are hopeful that […]

The post A Pivotal Time in NumFOCUS’s Project Aimed DEI Efforts appeared first on NumFOCUS.