Take a look at our Data Processing books. Shulph carries a great selection of Data Processing books, and we are always adding more.
Explore distributed ledger technology, decentralization, and smart contracts and develop real-time decentralized applications with Ethereum and Solidity Key Features Get to grips with the underlying technical principles and implementations of blockchain Build powerful applications using Ethereum to secure transactions and create smart contracts Gain advanced insights into cryptography and cryptocurrencies Book Description Blockchain technology is a distributed ledger with applications in industries such as finance, government, and media. This Learning Path is your guide to building blockchain networks using Ethereum, JavaScript, and Solidity. You will get started by understanding the technical foundations of blockchain technology, including distributed systems, cryptography and how this digital ledger keeps data secure. Further into the chapters, you'll gain insights into developing applications using Ethereum and Hyperledger. As you build on your knowledge of Ether security, mining, smart contracts, and Solidity, you'll learn how to create robust and secure applications that run exactly as programmed without being affected by fraud, censorship, or third-party interference. Toward the concluding chapters, you'll explore how blockchain solutions can be implemented in applications such as IoT apps, in addition to its use in currencies. The Learning Path will also highlight how you can increase blockchain scalability and even discuss the future scope of this fascinating and powerful technology. By the end of this Learning Path, you'll be equipped with the skills you need to tackle pain points encountered in the blockchain life cycle and confidently design and deploy decentralized applications.
This Learning Path includes content from the following Packt products: Mastering Blockchain - Second Edition by Imran Bashir Building Blockchain Projects by Narayan Prusty What you will learn Understand why decentralized applications are important Discover the mechanisms behind bitcoin and alternative cryptocurrencies Master how cryptography is used to secure data with the help of examples Maintain, monitor, and manage your blockchain solutions Create Ethereum wallets Explore research topics and the future scope of blockchain technology Who this book is for This Learning Path is designed for blockchain developers who want to build decentralized applications and smart contracts from scratch using Hyperledger. Basic familiarity with any programming language will be useful to get started with this Learning Path.
Master the intricacies of Elasticsearch 7.0 and use it to create flexible and scalable search solutions Key Features Master the latest distributed search and analytics capabilities of Elasticsearch 7.0 Perform searching, indexing, and aggregation of your data at scale Discover tips and techniques for speeding up your search query performance Book Description Building enterprise-grade distributed applications and executing systematic search operations call for a strong understanding of Elasticsearch and expertise in using its core APIs and latest features. This book will help you master the advanced functionalities of Elasticsearch and understand how you can confidently develop a sophisticated, real-time search engine. You'll also learn to run machine learning jobs in Elasticsearch to speed up routine tasks. You'll get started by learning to use Elasticsearch features on Hadoop and Spark to return search results faster, thereby enhancing the customer experience. You'll then get up to speed with performing analytics by building a metrics pipeline, defining queries, and using Kibana for intuitive visualizations that help provide decision-makers with better insights. The book will later guide you through using Logstash with examples to collect, parse, and enrich logs before indexing them in Elasticsearch. By the end of this book, you will have comprehensive knowledge of advanced topics such as Apache Spark support, machine learning using Elasticsearch and scikit-learn, and real-time analytics, along with the expertise you need to increase business productivity, perform analytics, and get the very best out of Elasticsearch.
What you will learn Pre-process documents before indexing in ingest pipelines Learn how to model your data in the real world Get to grips with using Elasticsearch for exploratory data analysis Understand how to build analytics and RESTful services Use Kibana, Logstash, and Beats for dashboard applications Get up to speed with Spark and Elasticsearch for real-time analytics Explore the basics of Spring Data Elasticsearch, and understand how to index, search, and query in a Spring application Who this book is for This book is for Elasticsearch developers and data engineers who want to take their basic knowledge of Elasticsearch to the next level and use it to build enterprise-grade distributed search applications. Prior experience of working with Elasticsearch will be useful to get the most out of this book.
Learn how to use artificial intelligence for product and service innovation, including the diverse use cases of Commerce.AI Key Features Learn how to integrate data and AI in your innovation workflows Unlock insights into how various industries are using AI for innovation Apply your knowledge to real innovation use cases like product strategy and market intelligence Book Description Commerce.AI is a suite of artificial intelligence (AI) tools, trained on over a trillion data points, to help businesses build next-gen products and services. If you want to be the best business on the block, using AI is a must. Developers and analysts working with AI will be able to put their knowledge to work with this practical guide. You'll begin by learning the core themes of new product and service innovation, including how to identify market opportunities, come up with ideas, and predict trends. With plenty of use cases as reference, you'll learn how to apply AI for innovation, both programmatically and with Commerce.AI. You'll also find out how to analyze product and service data with tools such as GPT-J, Python pandas, Prophet, and TextBlob. As you progress, you'll explore the evolution of commerce in AI, including how top businesses today are using AI. You'll learn how Commerce.AI merges machine learning, product expertise, and big data to help businesses make more accurate decisions.
Finally, you'll use the Commerce.AI suite for product ideation and analyzing market trends. By the end of this artificial intelligence book, you'll be able to strategize new product opportunities by using AI, and also have an understanding of how to use Commerce.AI for product ideation, trend analysis, and predictions. What you will learn Find out how machine learning can help you identify new market opportunities Understand how to use consumer data to create new products and services Use state-of-the-art AI frameworks and tools for data analysis Launch, track, and improve products and services with AI Rise above the competition with unparalleled insights from AI Turn customer touchpoints into business wins Generate high-conversion product and service copy Who this book is for This AI book is for AI developers, data scientists, data product managers, analysts, and consumer insights professionals. The book will guide you through the process of product and service innovation, no matter your pre-existing skillset.
Build efficient data flow and machine learning programs with this flexible, multi-functional open-source cluster-computing framework Key Features Master the art of real-time big data processing and machine learning Explore a wide range of use-cases to analyze large data Discover ways to optimize your work by using many features of Spark 2.x and Scala Book Description Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own data flow and machine learning programs on this platform. You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools. By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle.
This Learning Path includes content from the following Packt products: Mastering Apache Spark 2.x by Romeo Kienzler Scala and Spark for Big Data Analytics by Md. Rezaul Karim, Sridhar Alla Apache Spark 2.x Machine Learning Cookbook by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei What you will learn Get to grips with all the features of Apache Spark 2.x Perform highly optimized real-time big data processing Use ML and DL techniques with Spark MLlib and third-party tools Analyze structured and unstructured data using SparkSQL and GraphX Understand tuning, debugging, and monitoring of big data applications Build scalable and fault-tolerant streaming applications Develop scalable recommendation engines Who this book is for If you are an intermediate-level Spark developer looking to master the advanced capabilities and use-cases of Apache Spark 2.x, this Learning Path is ideal for you. Big data professionals who want to learn how to integrate and use the features of Apache Spark and build a strong big data pipeline will also find this Learning Path useful. To grasp the concepts explained in this Learning Path, you must know the fundamentals of Apache Spark and Scala.
A solution-based guide to putting your deep learning models into production with the power of Apache Spark Key Features Discover practical recipes for distributed deep learning with Apache Spark Learn to use libraries such as Keras and TensorFlow Solve problems in order to train your deep learning models on Apache Spark Book Description With deep learning gaining rapid mainstream adoption in modern-day industries, organizations are looking for ways to unite popular big data tools with highly efficient deep learning libraries. This, in turn, helps deep learning models train with higher efficiency and speed. With the help of the Apache Spark Deep Learning Cookbook, you'll work through specific recipes to generate outcomes for deep learning algorithms, without getting bogged down in theory. From setting up Apache Spark for deep learning to implementing different types of neural networks, this book tackles both common and not-so-common problems in performing deep learning in a distributed environment. In addition to this, you'll get access to deep learning code within Spark that can be reused to answer similar problems or tweaked to answer slightly different problems. You will also learn how to stream and cluster your data with Spark. Once you have got to grips with the basics, you'll explore how to implement and deploy deep learning models, such as Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN) in Spark, using popular libraries such as TensorFlow and Keras. By the end of the book, you'll have the expertise to train and deploy efficient deep learning models on Apache Spark.
What you will learn Set up a fully functional Spark environment Understand practical machine learning and deep learning concepts Apply built-in machine learning libraries within Spark Explore libraries that are compatible with TensorFlow and Keras Explore NLP models such as Word2vec and TF-IDF on Spark Organize dataframes for deep learning evaluation Apply testing and training modeling to ensure accuracy Access readily available code that may be reusable Who this book is for If you're looking for a practical and highly useful resource for efficiently implementing distributed deep learning models with Apache Spark, then the Apache Spark Deep Learning Cookbook is for you. Knowledge of the core machine learning concepts and a basic understanding of the Apache Spark framework are required to get the best out of this book. Additionally, some programming knowledge in Python is a plus.
Understand the Blockchain revolution and get to grips with Ethereum, Hyperledger Fabric, and Corda. Key Features Resolve common challenges and problems faced in the Blockchain domain Study architecture, concepts, terminologies, and Dapps Make smart choices using Blockchain for personal and business investments Book Description Blockchain Quick Reference takes you through the electrifying world of blockchain technology and is designed for those who want to polish their existing knowledge regarding the various pillars of the blockchain ecosystem. This book is your go-to guide, teaching you how to apply principles and ideas for making your life and business better. You will cover the architecture, Initial Coin Offerings (ICOs), tokens, smart contracts, and terminologies of the blockchain technology, before studying how they work. All you need is a curious mind to get started with blockchain technology. Once you have grasped the basics, you will explore components of Ethereum, such as ether tokens, transactions, and smart contracts, in order to build simple Dapps. You will then move on to learning why Solidity is used specifically for Ethereum-based projects, followed by exploring different types of blockchain with easy-to-follow examples. All this will help you tackle challenges and problems.
By the end of this book, you will not only have solved current and future problems relating to blockchain technology but will also be able to build efficient decentralized applications. What you will learn Understand how blockchain architecture components work Acquaint yourself with cryptography and the mechanics behind blockchain Apply consensus protocol to determine the business sustainability Understand what ICOs and crypto-mining are and how they work Create cryptocurrency wallets and coins for transaction mechanisms Understand the use of Ethereum for smart contract and DApp development Who this book is for Blockchain Quick Reference is for you if you are a developer who wants to get well-versed with blockchain and its associated concepts and terminologies. You will explore the working mechanism of a decentralized application with the help of examples. Business leaders and blockchain enthusiasts will also find this book useful, as it will help you effectively address challenges and make better personal and business investments.
Propose a new scalable data architecture paradigm, Data Lakehouse, that addresses the limitations of current data architecture patterns Key Features Understand how data is ingested, stored, served, governed, and secured for enabling data analytics Explore a practical way to implement Data Lakehouse using cloud computing platforms like Azure Combine multiple architectural patterns based on an organization's needs and maturity level Book Description The Data Lakehouse architecture is a new paradigm that enables large-scale analytics. This book will guide you in developing data architecture in the right way to ensure your organization's success. The first part of the book discusses the different data architectural patterns used in the past and the need for a new architectural paradigm, as well as the drivers that have caused this change. It covers the principles that govern the target architecture, the components that form the Data Lakehouse architecture, and the rationale and need for those components. The second part deep dives into the different layers of Data Lakehouse. It covers various scenarios and components for data ingestion, storage, data processing, data serving, analytics, governance, and data security. The book's third part focuses on the practical implementation of the Data Lakehouse architecture in a cloud computing platform. It focuses on various ways to combine the Data Lakehouse pattern to realize macro-patterns, such as Data Mesh and Data Hub-Spoke, based on the organization's needs and maturity level.
The frameworks introduced will be practical and organizations can readily benefit from their application. By the end of this book, you'll clearly understand how to implement the Data Lakehouse architecture pattern in a scalable, agile, and cost-effective manner. What you will learn Understand the evolution of the Data Architecture patterns for analytics Become well versed in the Data Lakehouse pattern and how it enables data analytics Focus on methods to ingest, process, store, and govern data in a Data Lakehouse architecture Learn techniques to serve data and perform analytics in a Data Lakehouse architecture Cover methods to secure the data in a Data Lakehouse architecture Implement Data Lakehouse in a cloud computing platform such as Azure Combine Data Lakehouse in a macro-architecture pattern such as Data Mesh Who this book is for This book is for data architects, big data engineers, data strategists and practitioners, data stewards, and cloud computing practitioners looking to become well-versed with modern data architecture patterns to enable large-scale analytics. Basic knowledge of data architecture and familiarity with data warehousing concepts are required.
Understand data in a simple way using a data lake. Key Features - In-depth practical demonstration of Hadoop/Yarn concepts with numerous examples. - Includes graphical illustrations and visual explanations for Hadoop commands and parameters. - Includes details of dimensional modeling and Data Vault modeling. - Includes details of how to create and define a structure for a data lake. Description The book 'Data Processing and Modeling with Hadoop' explains how a distributed system works and its benefits in the big data era in a straightforward and clear manner. After reading the book, you will be able to plan and organize projects involving a massive amount of data. The book describes the standards and technologies that aid in data management and compares them to other technology business standards. The reader receives practical guidance on how to segregate and separate data into zones, as well as how to develop a model that can aid in data evolution. It discusses security and the measures that are used to reduce the impact of security risks. Self-service analytics, Data Lake, Data Vault 2.0, and Data Mesh are discussed in the book. After reading this book, the reader will have a thorough understanding of how to structure a data lake, as well as the ability to plan, organize, and carry out the implementation of a data-driven business with full governance and security. What you will learn - Learn the basics of the components of the Hadoop Ecosystem. - Understand the structure, files, and zones of a Data Lake. - Learn to implement the security part of the Hadoop Ecosystem. - Learn to work with Data Vault 2.0 modeling. - Learn to develop a strategy to define good governance. - Learn new tools to work with Data and Big Data. Who this book is for This book caters to big data developers, technical specialists, consultants, and students who want to build good proficiency in big data. Knowing basic SQL concepts, modeling, and development would be good, although not mandatory.
Table of Contents 1. Understanding the Current Moment 2. Defining the Zones 3. The Importance of Modeling 4. Massive Parallel Processing 5. Doing ETL/ELT 6. A Little Governance 7. Talking About Security 8. What Are the Next Steps?
Gain hands-on experience with industry-standard data analysis and machine learning tools in Python Key Features Learn techniques to use data to identify the exact problem to be solved Visualize data using different graphs Identify how to select an appropriate algorithm for data extraction Book Description Data Science Projects with Python is designed to give you practical guidance on industry-standard data analysis and machine learning tools in Python, with the help of realistic data. The book will help you understand how you can use pandas and Matplotlib to critically examine a dataset with summary statistics and graphs, and extract the insights you seek to derive. You will continue to build on your knowledge as you learn how to prepare data and feed it to machine learning algorithms, such as regularized logistic regression and random forest, using the scikit-learn package. You'll discover how to tune the algorithms to provide the best predictions on new, unseen data. As you delve into later chapters, you'll be able to understand the working and output of these algorithms and gain insight into not only the predictive capabilities of the models but also their reasons for making these predictions. By the end of this book, you will have the skills you need to confidently use various machine learning algorithms to perform detailed data analysis and extract meaningful insights from unstructured data.
What you will learn Install the required packages to set up a data science coding environment Load data into a Jupyter Notebook running Python Use Matplotlib to create data visualizations Fit a model using scikit-learn Use lasso and ridge regression to reduce overfitting Fit and tune a random forest model and compare performance with logistic regression Create visuals using the output of the Jupyter Notebook Who this book is for If you are a data analyst, data scientist, or a business analyst who wants to get started with using Python and machine learning techniques to analyze data and predict outcomes, this book is for you. Basic knowledge of computer programming and data analytics is a must. Familiarity with mathematical concepts such as algebra and basic statistics will be useful.
Get the most out of Elasticsearch 7's new features to build, deploy, and manage efficient applications Key Features Discover the new features introduced in Elasticsearch 7 Explore techniques for distributed search, indexing, and clustering Gain hands-on knowledge of implementing Elasticsearch for your enterprise Book Description Elasticsearch is one of the most popular tools for distributed search and analytics. This Elasticsearch book highlights the latest features of Elasticsearch 7 and helps you understand how you can use them to build your own search applications with ease. Starting with an introduction to the Elastic Stack, this book will help you quickly get up to speed with using Elasticsearch. You'll learn how to install, configure, manage, secure, and deploy Elasticsearch clusters, as well as how to use your deployment to develop powerful search and analytics solutions. As you progress, you'll also understand how to troubleshoot any issues that you may encounter along the way. Finally, the book will help you explore the inner workings of Elasticsearch and gain insights into queries, analyzers, mappings, and aggregations as you learn to work with search results. By the end of this book, you'll have a basic understanding of how to build and deploy effective search and analytics solutions using Elasticsearch. What you will learn Install Elasticsearch and use it to safely store data and retrieve it when needed Work with a variety of analyzers and filters Discover techniques to improve search results in Elasticsearch Understand how to perform metric and bucket aggregations Implement best practices for moving clusters and applications to production Explore various techniques to secure your Elasticsearch clusters Who this book is for This book is for software developers, engineers, data architects, system administrators, and anyone who wants to get up and running with Elasticsearch 7. No prior experience with Elasticsearch is required.