
AI Resource Library: Issue 54 (20170515)

AllenOR灵感
Published 2017/09/10 01:20

1. [Blog] Handling imbalanced dataset in supervised learning using family of SMOTE algorithm

Summary:

Suppose you are working on a machine learning classification problem. You get an accuracy of 98% and you are very happy. But that happiness doesn’t last long when you look at the confusion matrix and realize that the majority class makes up 98% of the total data and every example is classified as the majority class. Welcome to the real world of imbalanced data sets!
Some well-known examples of imbalanced data sets are:
1 - Fraud detection: the number of fraud cases can be much smaller than the number of non-fraudulent transactions.
2 - Prediction of disputed / delayed invoices: the problem is to predict defaulted / disputed invoices.
3 - Predictive maintenance data sets, etc.
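
The 98% accuracy trap described above can be reproduced with a few lines of plain Python. This is only a minimal sketch: the data are made up (1,000 hypothetical transactions, 2% fraud), and the helper name `confusion_counts` is an illustration, not something taken from the linked post.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification problem."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical data: 1,000 transactions, 2% of them fraudulent (label 1).
y_true = [1] * 20 + [0] * 980
# A degenerate "model" that always predicts the majority class (label 0).
y_pred = [0] * len(y_true)

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
# Recall on the minority class: share of real fraud cases that were caught.
minority_recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"confusion matrix: tp={tp} fp={fp} fn={fn} tn={tn}")
print(f"accuracy={accuracy:.2f}, minority recall={minority_recall:.2f}")
```

Accuracy looks excellent here while recall on the minority class is zero, which is why the confusion matrix (or per-class metrics) is the more informative view on imbalanced data.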

Original link: http://www.datasciencecentral.com/profiles/blogs/handling-imbalanced-data-sets-in-supervised-learning-using-family


2. [Resource] Top 15 Python Libraries for Data Science in 2017

Summary:


Core Libraries.

  1. NumPy
  2. SciPy
  3. Pandas

Visualization.

  1. Matplotlib
  2. Seaborn
  3. Bokeh
  4. Plotly

Machine Learning.

  1. SciKit-Learn

Deep Learning - Keras / TensorFlow / Theano

  1. Theano
  2. TensorFlow

Original link: https://activewizards.com/blog/top-15-libraries-for-data-science-in-python/


3. [Paper] Accurate and reproducible invasive breast cancer detection in whole-slide images: A Deep Learning approach for quantifying tumor extent

Summary:

With the increasing ability to routinely and rapidly digitize whole slide images with slide scanners, there has been interest in developing computerized image analysis algorithms for automated detection of disease extent from digital pathology images. The manual identification of presence and extent of breast cancer by a pathologist is critical for patient management for tumor staging and assessing treatment response. However, this process is tedious and subject to inter- and intra-reader variability. For computerized methods to be useful as decision support tools, they need to be resilient to data acquired from different sources, different staining and cutting protocols and different scanners. The objective of this study was to evaluate the accuracy and robustness of a deep learning-based method to automatically identify the extent of invasive tumor on digitized images. Here, we present a new method that employs a convolutional neural network for detecting presence of invasive tumor on whole slide images. Our approach involves training the classifier on nearly 400 exemplars from multiple different sites, and scanners, and then independently validating on almost 200 cases from The Cancer Genome Atlas. Our approach yielded a Dice coefficient of 75.86%, a positive predictive value of 71.62% and a negative predictive value of 96.77% in terms of pixel-by-pixel evaluation compared to manually annotated regions of invasive ductal carcinoma.
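
For readers unfamiliar with the three pixel-by-pixel metrics quoted in the abstract, the sketch below computes them from a predicted mask and a manually annotated mask. It uses the standard definitions (Dice = 2TP / (2TP + FP + FN), PPV = TP / (TP + FP), NPV = TN / (TN + FN)); the function name `pixel_metrics` and the tiny 4x4 masks are invented for illustration and are not the paper's code or data.

```python
def pixel_metrics(pred, truth):
    """Compute (dice, ppv, npv) for two binary masks given as lists of rows."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    # Guard against empty denominators (e.g. a mask with no positives).
    dice = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    npv = tn / (tn + fn) if (tn + fn) else 0.0
    return dice, ppv, npv


# Toy masks (1 = tumor pixel, 0 = background), purely illustrative.
truth = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
pred = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

dice, ppv, npv = pixel_metrics(pred, truth)
print(f"Dice={dice:.4f}, PPV={ppv:.4f}, NPV={npv:.4f}")
```

Because background pixels usually dominate a whole-slide image, NPV tends to be high even for mediocre predictions, so Dice and PPV are the more demanding numbers to read.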

Original link: https://www.nature.com/articles/srep46450


4. [Blog] Neural networks for algorithmic trading 1.2 - Correct time series forecasting + backtesting

Summary:


Hi everyone! Some time ago I published a small tutorial on financial time series forecasting which was interesting, but wrong in places. I have spent some time working with time series of different natures (mostly applying NNs) at HPA, which focuses particularly on financial analytics, and in this post I want to describe a more correct way of working with financial data. Compared to the previous post, I want to show a different way of normalizing data and discuss overfitting in more detail (it definitely appears when working with data of a stochastic nature). We won’t compare different architectures (CNN, LSTM); you can check them in the previous post. But even working only with simple feed-forward neural nets we will see important things. If you want to jump directly to the code, check out the IPython Notebook. For Russian-speaking readers, it’s a translation of my post here, and you can check the webinar on backtesting here.

Original link: https://medium.com/@alexrachnog/neural-networks-for-algorithmic-trading-1-2-correct-time-series-forecasting-backtesting-9776bfd9e589


5. [Course] I Dropped Out of School to Create My Own Data Science Master’s: Here’s My Curriculum

Summary:

I dropped out of a top computer science program to teach myself data science using online resources like Udacity, edX, and Coursera. The decision was not difficult. I could learn the content I wanted to faster, more efficiently, and for a fraction of the cost. I already had a university degree and, perhaps more importantly, I already had the university experience. Paying $30K+ to go back to school seemed irresponsible.

Original link: https://medium.com/@davidventuri/i-dropped-out-of-school-to-create-my-own-data-science-master-s-here-s-my-curriculum-1b400dcee412


This article is reposted from: http://www.jianshu.com/p/7b9b84500929
