I. Python Coding Standards

A Concise Python Coding Standard
  https://blog.csdn.net/gzlaiyonghao/article/details/2834883
Python Language Rules (Google Python Style Guide)
  http://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_language_rules/
Python Style Rules (Google Python Style Guide)
  http://zh-google-styleguide.readthedocs.io/en/latest/google-python-styleguide/python_style_rules/
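
As a quick illustration of the kind of conventions these guides describe, here is a minimal sketch in the Google Python style: lower_with_under function names, CapWords class names, CAPS_WITH_UNDER constants, and docstrings with Args/Returns/Raises sections. The names DatasetSplit, split_sizes, and DEFAULT_TEST_FRACTION are hypothetical and exist only for this example.

```python
"""Helpers for describing a train/test split (illustrative module docstring)."""

DEFAULT_TEST_FRACTION = 0.25  # Module-level constant: CAPS_WITH_UNDER.


class DatasetSplit:
    """Sizes of a train/test split.

    Attributes:
        train_size: Number of training samples.
        test_size: Number of test samples.
    """

    def __init__(self, train_size, test_size):
        self.train_size = train_size
        self.test_size = test_size


def split_sizes(total_samples, test_fraction=DEFAULT_TEST_FRACTION):
    """Computes how many samples go to each side of a split.

    Args:
        total_samples: Total number of samples; must be non-negative.
        test_fraction: Fraction of samples reserved for testing, in [0, 1].

    Returns:
        A DatasetSplit holding the train and test sizes.

    Raises:
        ValueError: If an argument is out of range.
    """
    if total_samples < 0 or not 0 <= test_fraction <= 1:
        raise ValueError('arguments out of range')
    # Truncate toward zero so the test set never exceeds the requested fraction.
    test_size = int(total_samples * test_fraction)
    return DatasetSplit(total_samples - test_size, test_size)
```

The guides linked above cover many more rules (imports, exceptions, line length, and so on); this sketch only shows naming and docstring layout.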


II. Large Datasets for Machine Learning

General dataset search sites


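| UCI Machine Learning Repository  http://archive.ics.uci.edu/ml/index.php
  The best-known dataset repository; the data used in many papers comes from here.
| AWS Public Datasets  https://aws.amazon.com/cn/datasets/
  Datasets hosted by Amazon Web Services, covering astronomy, biology, chemistry, weather, economics, and other fields.
| YAHOO Webscope datasets  https://webscope.sandbox.yahoo.com/
  Datasets provided by Yahoo, covering images, language, ranking and classification, and other areas.
| Kaggle datasets  https://www.kaggle.com/datasets
  The dataset library of the Kaggle competition platform. It holds many interesting datasets from industry, such as data from Uber, the Netflix Prize, and McDonald's.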


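Computer vision

| ImageNet  http://www.image-net.org/
  The best-known dataset in image processing. You can search for images of any category your project needs and use them for object recognition, localization, classification, scene parsing, and similar problems. It contains 14,197,122 images of varying sizes, about 140 GB in total.
| MNIST  http://yann.lecun.com/exdb/mnist/
  Practically every newly proposed machine learning algorithm is run on this dataset. MNIST is a database of handwritten digits with 60,000 training samples and 10,000 test samples; it is a subset of the NIST database.
| The CIFAR-10 dataset  https://www.cs.toronto.edu/~kriz/cifar.html
  32x32 color images.
| Google Open Images  https://github.com/ejlb/google-open-image-download
  A large annotated image dataset released by Google, with annotations covering 7,800 categories across 9 million images.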
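
The MNIST files on the page linked above are distributed in the IDX binary format: a big-endian header (magic number 2051 for image files, 2049 for label files) followed by one unsigned byte per pixel or label. The sketch below shows one way to parse that layout with only the standard library; the function names are hypothetical, and the demo uses tiny synthetic data rather than the real files.

```python
import struct


def parse_idx_images(data):
    """Parses the bytes of an uncompressed IDX image file.

    Returns a list of images, each a flat list of rows*cols pixel values.
    """
    magic, count, rows, cols = struct.unpack('>IIII', data[:16])
    if magic != 2051:
        raise ValueError('not an IDX image file: magic=%d' % magic)
    size = rows * cols
    if len(data) != 16 + count * size:
        raise ValueError('unexpected file length')
    return [list(data[16 + i * size:16 + (i + 1) * size])
            for i in range(count)]


def parse_idx_labels(data):
    """Parses the bytes of an uncompressed IDX label file into a list of ints."""
    magic, count = struct.unpack('>II', data[:8])
    if magic != 2049:
        raise ValueError('not an IDX label file: magic=%d' % magic)
    labels = list(data[8:])
    if len(labels) != count:
        raise ValueError('unexpected file length')
    return labels


# Synthetic demo: two 2x2 "images" and their labels, built in memory.
image_bytes = (struct.pack('>IIII', 2051, 2, 2, 2)
               + bytes([0, 255, 10, 20, 1, 2, 3, 4]))
label_bytes = struct.pack('>II', 2049, 2) + bytes([7, 3])

images = parse_idx_images(image_bytes)
labels = parse_idx_labels(label_bytes)
print(len(images), labels)
```

The downloads are gzip-compressed, so for the real files you would first read the bytes with something like gzip.open(path, 'rb').read(), where path is wherever you saved the download, and then pass the result to these functions.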


Natural language processing

| Text classification datasets  https://drive.google.com/drive/folders/0Bz8a_Dbh9Qhbfll6bVpmNUtUcFdjYmF2SEpmZUZUcVNiMUw1TWN6RDV3a0JHT3kxLVhVR2M
  A large collection that combines text classification data from DBPedia, Amazon, Yelp, Yahoo!, Sogou, and AG. Sample counts range from 120K to 3.6M, and the tasks range from 2 to 14 classes.
| WikiText  https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset
  A large language modeling corpus drawn from Wikipedia articles.
| Billion Words  http://www.statmt.org/lm-benchmark/
  Commonly used to train distributed word representations such as word2vec or GloVe.
| Stanford Sentiment Treebank  http://nlp.stanford.edu/sentiment/code.html
  A dataset for sentiment analysis.


Speech recognition

| 2000 HUB5 English  https://catalog.ldc.upenn.edu/LDC2002T43
  English speech data.
| CHIME  http://spandh.dcs.shef.ac.uk/chime_challenge/data.html
  A speech recognition dataset that includes noise.
| TED-LIUM  http://www-lium.univ-lemans.fr/en/content/ted-lium-corpus
  A speech dataset built from TED talks, with matching full transcripts.


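Other

| UCR Time Series  http://www.cs.ucr.edu/~eamonn/time_series_data/
  The "ImageNet" of the time series community; papers in the field are expected to report results on it.
| Million Song Dataset  https://labrosa.ee.columbia.edu/millionsong/
  Useful for programmers working on music recommendation or classification.
| Netflix recommender system data  http://dataju.cn/Dataju/web/datasetInstanceDetail/32
  A movie rating dataset with more than 1 million ratings of 17 thousand movies from 480 thousand randomly selected Netflix customers, covering October 1998 through November 2005. Ratings use a 5-point scale, so each movie is rated from 1 to 5. Customer information has been anonymized.
| Udacity self-driving car dataset  https://github.com/udacity/self-driving-car/
  The self-driving car dataset from Udacity's open self-driving course, which aims to build an open-source self-driving project. It consists of several compressed binary files, about 100 GB in total.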


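III. Recent Machine Learning Interview Reports from Major IT Companies

Tencent
1. [Data mining interview report] Tencent + Baidu + Huawei (SP offers from all three) (repost)
   http://blog.csdn.net/zhaoyu106/article/details/52853377
2. Tencent - Machine Learning (Shenzhen)
   http://www.job592.com/pay/ms224054.html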

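Baidu
1. [Data mining interview report] Tencent + Baidu + Huawei (SP offers from all three) (repost)
   http://blog.csdn.net/zhaoyu106/article/details/52853377
2. 2016 Baidu "machine learning / data mining" interview report: first, second, and third rounds
   http://blog.csdn.net/zzukun/article/details/52687842
3. Baidu - Machine Learning (Beijing)
   http://www.job592.com/pay/ms276014.html
   http://www.job592.com/pay/ms276174.html
4. Baidu machine learning / data mining: notes on being eliminated in the first round
   http://blog.csdn.net/mpbchina/article/details/8018005
5. Interview report: Baidu machine learning and natural language processing (NLP)
   http://blog.sina.com.cn/s/blog_8af1069601013hl1.html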

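Alibaba
1. Interview summary for algorithm & machine learning roles at Chinese internet companies (Ali Star)
   http://www.100mian.com/mianshi/sousuosuanfa/49035.html
2. Four interviews as an Alibaba intern candidate (machine learning)
   http://www.jianshu.com/p/0a1148bb0c70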

Huawei
1. [Data mining interview report] Tencent + Baidu + Huawei (SP offers from all three) (repost)
   http://blog.csdn.net/zhaoyu106/article/details/52853377

Meituan
1. Meituan machine learning interview report
   http://blog.csdn.net/zr459927180/article/details/51966345

NetEase
1. NetEase - Machine Learning (Guangzhou)
   http://www.job592.com/pay/ms276733.html

Google
1. Google Data Scientist Interview Questions
   https://www.glassdoor.com/Interview/Google-Data-Scientist-Interview-Questions-EI_IE9079.0,6_KO7,21.htm
2. Google Software Engineer Machine Learning Interview Questions
   https://www.glassdoor.com/Interview/Google-Software-Engineer-Machine-Learning-Interview-Questions-EI_IE9079.0,6_KO7,41.htm

Facebook
1. Machine Learning Software Engineer Interview
   https://www.glassdoor.com/Interview/Facebook-Machine-Learning-Software-Engineer-Interview-Questions-EI_IE40772.0,8_KO9,43.htm
2. Facebook Machine Learning Interview Questions
   https://www.glassdoor.com/Interview/Facebook-Machine-Learning-Interview-Questions-EI_IE40772.0,8_KO9,25.htm
3. Facebook Research Scientist Interview Questions
   https://www.glassdoor.com/Interview/Facebook-Research-Scientist-Interview-Questions-EI_IE40772.0,8_KO9,27.htm

Amazon
1. Amazon Machine Learning Scientist Interview Questions
   https://www.glassdoor.com/Interview/Amazon-Machine-Learning-Scientist-Interview-Questions-EI_IE6036.0,6_KO7,33.htm
2. Amazon Applied Scientist Interview Questions
   https://www.glassdoor.com/Interview/Amazon-Applied-Scientist-Interview-Questions-EI_IE6036.0,6_KO7,24.htm
3. Amazon Machine Learning Interview Questions
   https://www.glassdoor.com/Interview/Amazon-Machine-Learning-Interview-Questions-EI_IE6036.0,6_KO7,23.htm
4. How should I prepare for an interview with the Amazon machine learning group?
   https://www.quora.com/How-should-I-prepare-for-an-interview-with-the-Amazon-machine-learning-group
5. Amazon Interview Experience | 220 (On-Campus)
   http://www.geeksforgeeks.org/amazon-interview-experience-220-on-campus/

LinkedIn
1. LinkedIn Software Engineer/Machine Learning Interview Questions
   https://www.glassdoor.com/Interview/LinkedIn-Interview-Questions-E34865.htm?filter.jobTitleExact=Software+Engineer%2FMachine+Learning
2. LinkedIn Data Scientist Interview Questions
   https://www.glassdoor.com/Interview/LinkedIn-Data-Scientist-Interview-Questions-EI_IE34865.0,8_KO9,23.htm
3. LinkedIn Machine Learning Engineer Interview Questions
   https://www.glassdoor.com/Interview/LinkedIn-Machine-Learning-Engineer-Interview-Questions-EI_IE34865.0,8_KO9,34.htm

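Other companies
1. Collected notes on machine learning job interviews
   http://blog.csdn.net/Andrewseu/article/details/53940726
2. Machine Zone interview report: OA + Onsite
   http://www.themianjing.com/2015/06/machine-zone-%E9%9D%A2%E7%BB%8F-oaonsite/
3. Yinjiang - Machine Learning (Hangzhou)
   http://www.job592.com/pay/ms269483.html
4. Exclusive: interview reports for machine learning roles in Silicon Valley
   https://zhuanlan.zhihu.com/p/25066733
5. Interview summary for algorithm & machine learning roles at Chinese internet companies (Ali Star)
   http://www.pig66.com/weixintoutiao/dianzandang/2016-02-22/636092.html
6. Interview experience for machine learning roles at top-tier internet companies
   http://blog.csdn.net/shuaishuai3409/article/details/52886453
7. Sogou machine learning algorithm engineer: interview experience
   http://www.yjbys.com/mianshi/692123.html
8. 25+ company interviews in 4 months: a senior student's guide to landing a Data Scientist position (includes interview reports from 15 major companies such as Facebook and Google)
   http://posts.careerengine.us/p/57695dc6fba2e4e631c8b043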

Related questions

Interview/
4. Machine learning interview report
   http://www.cnblogs.com/xiangzhi/p/4842757.html

6. Summary of machine learning interview topics (continuously updated)
   http://www.cnblogs.com/zuochongyan/p/5407053.html
7. [Practical guide] Summary of common machine learning algorithm interview questions
   http://www.cstor.cn/textdetail_10481.html
8. What are some common Machine Learning interview questions?
   https://www.quora.com/What-are-some-common-Machine-Learning-interview-questions
9. Machine Learning Interview Questions
   https://www.glassdoor.com/Interview/machine-learning-interview-questions-SRCH_KO0,16.htm
10. 21 Must-Know Machine Learning Interview Questions and Answers
   https://elitedatascience.com/machine-learning-interview-questions-answers
11. Top 50 Machine Learning Interview Questions & Answers
   http://career.guru99.com/top-50-interview-questions-on-machine-learning/
12. ML Job Interview Questions
   https://www.reddit.com/r/MachineLearning/comments/1wmayh/ml_job_interview_questions/
13. Top-down learning path: Machine Learning for Software Engineers
   https://github.com/ZuzooVn/machine-learning-for-software-engineers
14. What are some good interview questions for statistical algorithm developer candidates?
   https://stats.stackexchange.com/questions/210639/what-are-some-good-interview-questions-for-statistical-algorithm-developer-candi
15. Interview Questions for Data Scientist / Research Engineer Positions
   https://www.linkedin.com/pulse/interview-questions-data-scientist-research-engineer-ahmed-el-deeb