Category Encoders v1.2.5 Release

This release was actually cut a couple of weeks ago, but I forgot to post about it here. It's been a release of mainly incremental changes, but also one of increased contributions from the community, so while it's not a huge feature-packed release, it's one I'm particularly proud of. Here's to more like this.

It came around 4 months after the last release, which I think is a pretty decent cadence, considering our level of development.

Some highlights:

  • Andrethrill did some work to make binary encoding more stable when training and transforming on datasets with different counts of categories.
  • The same fix was applied to BaseNEncoder.
  • Cameron Davison updated the type coercion code for Pandas DataFrames to quiet some deprecation warnings.
  • Cameron Davison also did some work to ensure consistent ordering of categories in the ordinal encoder, and the encoders which use it.
  • HBGHHY added leave-one-out encoding, a new method for us, found on Kaggle.
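For anyone who hasn't run into leave-one-out encoding before, here is a rough sketch of the idea in plain Python. It is not the code that went into the library: the function name `leave_one_out_encode` is made up for illustration, and falling back to the global mean for categories that appear only once is my assumption here. Each row's category is replaced by the mean of the target over all the other rows that share that category, so a row never sees its own target value.

```python
from collections import defaultdict


def leave_one_out_encode(categories, targets):
    """Encode each category as the mean target of the *other* rows in that category.

    Hypothetical helper for illustration only; the library's encoder may
    handle edge cases differently.
    """
    if len(categories) != len(targets):
        raise ValueError("categories and targets must have the same length")
    if not targets:
        return []

    # Per-category running sum and count of the target.
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1

    # Assumption: categories seen only once fall back to the global mean,
    # since there are no other rows to average over.
    global_mean = sum(targets) / len(targets)

    encoded = []
    for cat, y in zip(categories, targets):
        if counts[cat] > 1:
            # Remove this row's own target before averaging.
            encoded.append((sums[cat] - y) / (counts[cat] - 1))
        else:
            encoded.append(global_mean)
    return encoded


colors = ["red", "red", "blue", "red", "blue", "green"]
clicked = [1, 0, 1, 1, 0, 1]
encoded = leave_one_out_encode(colors, clicked)
print(encoded)
```

In the toy data, the first "red" row is encoded as 0.5, because the other two "red" rows have targets 0 and 1. The lone "green" row has no siblings, so it gets the overall mean of the target instead.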

So if you haven't used it already, check out category encoders; it's great. If you do use it and like it, hop on over to GitHub and join us. There's always something new to work on.

https://github.com/scikit-learn-contrib/categorical-encoding

Will

Will has a background in Mechanical Engineering from Auburn, but mostly just writes software now. He was the first employee at Predikto, and is currently building out the premier platform for predictive maintenance in heavy industry there as Chief Scientist. When not working on that, he is generally working on something related to Python, data science or cycling.
