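Data science is an interdisciplinary field that brings together machine learning, statistics, advanced analytics, and programming. It is a new discipline for the cognitive era that uncovers hidden insights and puts data to work.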
IBM Data Science Experience (DSX) is an enterprise platform for data scientists and data engineers. It offers out-of-the-box open-source and commercial data science tools including RStudio, Apache Spark, Jupyter, and Zeppelin notebooks. DSX supports the entire data science lifecycle, from data preparation and ETL to model development and deployment. With DSX, companies can build predictive and machine learning models using their favorite tools, technologies, and libraries, while leveraging the scale, security, and governance of the Hortonworks Data Platform (HDP).
Data science lifecycle
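As a rough illustration of the lifecycle stages named above (data preparation, model development, deployment), here is a minimal sketch in plain Python. It does not use any DSX or HDP API. The sample records, the function names (prepare, fit_line, predict), and the choice of a simple least-squares line are all assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical raw records: (input, target); None marks a missing value.
RAW_ROWS = [(1.0, 3.1), (2.0, 4.9), (None, 7.2), (3.0, 7.0), (4.0, 9.1), (5.0, None)]


def prepare(rows):
    """Data preparation: drop rows that contain a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]


def fit_line(points):
    """Model development: ordinary least squares for y = slope * x + intercept."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict(model, x):
    """Deployment stand-in: score a new input with the fitted model."""
    slope, intercept = model
    return slope * x + intercept


clean = prepare(RAW_ROWS)
model = fit_line(clean)
print(f"rows kept: {len(clean)}, prediction for x=6: {predict(model, 6.0):.2f}")
```

In a notebook, each function would typically live in its own cell, so that the preparation step can be rerun on fresh data without refitting the model by hand.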
Access to community
DSX provides a social environment where data scientists can research and share articles, data sets, notebooks, and tutorials. DSX enables data scientists and analysts to come up to speed by taking courses in R, Python, or Scala, copying content into a Jupyter or Zeppelin notebook, or working in an embedded RStudio environment.
With DSX, data scientists have the flexibility to create new Jupyter or Zeppelin notebooks in R, Python, or Scala, or to import an existing notebook. DSX includes popular open source libraries, such as PySpark, matplotlib, and SparkML, as well as machine learning and deep learning APIs. Data scientists can use DSX to tell a compelling story with the help of open source visualization libraries like Brunel and PixieDust, and they can install other open source libraries of their choice.
Code in Scala, Python, R, Apache Spark and SQL
Visualize and share code using Zeppelin & Jupyter Notebooks
Leverage RStudio IDE and Shiny
Use your favorite libraries including Scikit-learn, XGBoost, Spark MLlib, TensorFlow, Caffe, Keras and MXNet
The combination of HDP and DSX empowers enterprises to run data science at scale by leveraging all the data in the data lake, as well as deploying enterprise-grade security, governance, and operations.
Data Science at Scale - Run Spark Jobs on HDP Cluster
Apache, Hadoop, Falcon, Atlas, Tez, Sqoop, Flume, Kafka, Pig, Hive, HBase, Accumulo, Storm, Solr, Spark, Ranger, Knox, Ambari, ZooKeeper, Oozie, Phoenix, NiFi, NiFi Registry, HAWQ, Zeppelin, Slider, Mahout, MapReduce, HDFS, YARN, Metron and the Hadoop elephant and Apache project logos are either registered trademarks or trademarks of the Apache Software Foundation in the United States or other countries.