1. [Blog] Example of TensorFlows new Input Pipeline
While writing my last article about finetuning AlexNet with TensorFlow, I read about the best practices for optimizing performance in TensorFlow. I had done several things differently from these practices, but I think the one with the biggest effect on performance was everything around the input pipeline. The new version of TensorFlow introduced the Dataset API, which provides a good and relatively easy way to write our own input pipeline using nothing more than TensorFlow. While you can potentially use these Datasets for any kind of input data, I will use images as the use case for this article. By the end of this article you will hopefully be able to use the new Dataset API for your own project and decrease the computation time needed to train your model.
2. [Blog] Random Forest – Supervised classification machine learning algorithm
Random Forest is a go-to machine learning algorithm that uses a bagging approach to build a collection of decision trees, each trained on a random subset of the data. It is considered one of the most effective algorithms for almost any prediction task, and it can be used for both classification and regression problems. It is a combination of tree predictors where each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest.
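To make the bagging idea concrete, here is a minimal sketch in plain Python. It is not the code from the linked post: the function names (`fit_stump`, `fit_forest`, `predict_forest`), the toy data, and the use of one-split "stumps" in place of full decision trees are all assumptions made to keep the example short. Each stump is trained on a bootstrap sample of the rows and a random subset of the features, and the forest predicts by majority vote.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_indices):
    """Fit a one-split tree: pick the (feature, threshold) with the fewest errors."""
    best = None
    for f in feature_indices:
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must put at least one row on each side
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:
        # No usable split (e.g. all sampled rows identical): predict the majority label.
        majority = Counter(labels).most_common(1)[0][0]
        return (None, None, majority, majority)
    return best[1:]


def predict_stump(stump, row):
    f, threshold, left_label, right_label = stump
    if f is None:
        return left_label
    return left_label if row[f] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # draw rows with replacement
        features = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            fit_stump([rows[i] for i in sample], [labels[i] for i in sample], features)
        )
    return forest


def predict_forest(forest, row):
    """Majority vote over the individual stump predictions."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Toy data (invented for illustration): two well-separated groups of 2-D points.
rows = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.0), (8.0, 9.0), (8.5, 9.5), (9.0, 8.7)]
labels = [0, 0, 0, 1, 1, 1]
forest = fit_forest(rows, labels)
print(predict_forest(forest, (0.5, 0.5)), predict_forest(forest, (9.5, 9.5)))
```

A production implementation would grow deeper trees and use an impurity measure such as Gini to choose splits; the sketch only shows how bootstrap sampling, random feature selection, and voting fit together.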
This is the code used for the experiments in the paper "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks". The sentiment analysis experiment relies on a fork of Keras which implements Bayesian LSTM, Bayesian GRU, embedding dropout, and MC dropout. The language model experiment extends wojzaremba's Lua code.
5. [Blog] Text Clustering : Get quick insights from Unstructured
In this two-part series, we will explore text clustering and how to get insights from unstructured data. The approach is intended to be powerful and industrial-strength. The first part will focus on the motivation. The second part will be about implementation.
This post is the first part of the two-part series on how to get insights from unstructured data using text clustering. We will build this in a very modular way so that it can be applied to any dataset. Moreover, we will also focus on exposing the functionality as an API so that it can serve as a plug-and-play model without any disruption to existing systems.
- Text Clustering: How to get quick insights from Unstructured Data – Part 1: The Motivation
- Text Clustering: How to get quick insights from Unstructured Data – Part 2: The Implementation