
AI Resource Library: Issue 54 (20170515)

AllenOR灵感
Published 2017/09/10 01:20

1. [Blog] Handling imbalanced dataset in supervised learning using family of SMOTE algorithm

Summary:

Consider a machine learning classification problem. You get an accuracy of 98% and you are very happy. But that happiness doesn't last long when you look at the confusion matrix and realize that the majority class makes up 98% of the total data and all examples are classified as the majority class. Welcome to the real world of imbalanced data sets!
Some well-known examples of imbalanced data sets are:
1 - Fraud detection: where the number of fraud cases can be much smaller than the number of non-fraudulent transactions.
2 - Prediction of disputed / delayed invoices: where the problem is to predict default / disputed invoices.
3 - Predictive maintenance data sets, etc.
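
The 98% accuracy trap described above can be reproduced in a few lines. The following is only a minimal sketch on made-up labels (the 98/2 split, the always-majority "classifier", and the helper name `confusion_counts` are illustrative assumptions, not taken from the linked post). It shows that accuracy looks excellent while recall on the minority class is zero.

```python
# Hypothetical toy data: 98 majority-class (0) labels and 2 minority-class (1) labels.
y_true = [0] * 98 + [1] * 2

# A degenerate "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn), treating `positive` as the minority class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(y_true, y_pred)

# Accuracy counts every correct prediction, so the majority class dominates it.
accuracy = (tp + tn) / len(y_true)

# Recall on the minority class shows how many rare cases were actually found.
recall = tp / (tp + fn) if (tp + fn) else 0.0

print("confusion (tp, fp, fn, tn):", (tp, fp, fn, tn))
print("accuracy:", accuracy)
print("minority recall:", recall)
```

Reporting per-class metrics alongside accuracy is one simple way to notice this situation before reaching for resampling methods such as SMOTE.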

Original link: http://www.datasciencecentral.com/profiles/blogs/handling-imbalanced-data-sets-in-supervised-learning-using-family


2. [Resource] Top 15 Python Libraries for Data Science in 2017

Summary:


Core Libraries.

  1. NumPy
  2. SciPy
  3. Pandas

Visualization.

  1. Matplotlib
  2. Seaborn
  3. Bokeh
  4. Plotly

Machine Learning.

  1. SciKit-Learn

Deep Learning - Keras / TensorFlow / Theano

  1. Theano
  2. TensorFlow

Original link: https://activewizards.com/blog/top-15-libraries-for-data-science-in-python/


3. [Paper] Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent

Summary:

With the increasing ability to routinely and rapidly digitize whole slide images with slide scanners, there has been interest in developing computerized image analysis algorithms for automated detection of disease extent from digital pathology images. The manual identification of presence and extent of breast cancer by a pathologist is critical for patient management for tumor staging and assessing treatment response. However, this process is tedious and subject to inter- and intra-reader variability. For computerized methods to be useful as decision support tools, they need to be resilient to data acquired from different sources, different staining and cutting protocols and different scanners. The objective of this study was to evaluate the accuracy and robustness of a deep learning-based method to automatically identify the extent of invasive tumor on digitized images. Here, we present a new method that employs a convolutional neural network for detecting presence of invasive tumor on whole slide images. Our approach involves training the classifier on nearly 400 exemplars from multiple different sites, and scanners, and then independently validating on almost 200 cases from The Cancer Genome Atlas. Our approach yielded a Dice coefficient of 75.86%, a positive predictive value of 71.62% and a negative predictive value of 96.77% in terms of pixel-by-pixel evaluation compared to manually annotated regions of invasive ductal carcinoma.
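
For readers unfamiliar with the three pixel-by-pixel metrics quoted in the abstract, here is a minimal sketch of how they are commonly computed from a predicted mask and a manually annotated mask. The tiny 4x4 masks and the function name `pixel_metrics` are hypothetical and chosen only for illustration; this is not the paper's evaluation code, and the handling of empty denominators is a convention assumed here.

```python
def pixel_metrics(pred, truth):
    """Compute Dice, PPV and NPV from two flat sequences of 0/1 pixel labels."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1

    # Dice = 2*TP / (2*TP + FP + FN); defined as 1.0 here when both masks are empty.
    dice_den = 2 * tp + fp + fn
    dice = 2 * tp / dice_den if dice_den else 1.0

    # PPV: fraction of predicted tumor pixels that are annotated as tumor.
    ppv = tp / (tp + fp) if (tp + fp) else 0.0

    # NPV: fraction of predicted non-tumor pixels that are annotated as non-tumor.
    npv = tn / (tn + fn) if (tn + fn) else 0.0

    return {"dice": dice, "ppv": ppv, "npv": npv}


# Hypothetical 4x4 masks, flattened row by row (1 = invasive tumor pixel).
truth = [0, 0, 0, 0,
         0, 1, 1, 0,
         0, 1, 1, 0,
         0, 0, 0, 0]
pred = [0, 0, 0, 0,
        0, 1, 1, 1,
        0, 1, 0, 0,
        0, 0, 0, 0]

metrics = pixel_metrics(pred, truth)
print(metrics)
```

Because non-tumor pixels usually far outnumber tumor pixels on a slide, NPV tends to be high even when Dice and PPV are moderate, which matches the pattern in the reported figures.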

Original link: https://www.nature.com/articles/srep46450


4. [Blog] Neural networks for algorithmic trading 1.2 — Correct time series forecasting + backtesting

Summary:


Hi everyone! Some time ago I published a small tutorial on financial time series forecasting which was interesting, but in places wrong. I have spent some time working with time series of different natures (mostly applying NNs) at HPA, which focuses particularly on financial analytics, and in this post I want to describe a more correct way of working with financial data. Compared to the previous post, I want to show a different way of normalizing data and discuss overfitting in more depth (it definitely appears when working with data of a stochastic nature). We won't compare different architectures (CNN, LSTM); you can check them in the previous post. But even working only with simple feed-forward neural nets, we will see important things. If you want to jump directly to the code, check out the IPython Notebook. For Russian-speaking readers, it's a translation of my post here, and you can check the webinar on backtesting here.

Original link: https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-1-2-correct-time-series-forecasting-backtesting-9776bfd9e589


5. [Course] I Dropped Out of School to Create My Own Data Science Master’s — Here’s My Curriculum

Summary:

I dropped out of a top computer science program to teach myself data science using online resources like Udacity, edX, and Coursera. The decision was not difficult. I could learn the content I wanted to faster, more efficiently, and for a fraction of the cost. I already had a university degree and, perhaps more importantly, I already had the university experience. Paying $30K+ to go back to school seemed irresponsible.

Original link: https://medium.com/@davidventuri/i-dropped-out-of-school-to-create-my-own-data-science-master-s-here-s-my-curriculum-1b400dcee412


Reposted from: http://www.jianshu.com/p/7b9b84500929
