Getting Started | Datavines Installation and Deployment

2024-09-07 10:51:23

Abstract: This article covers deploying Datavines from source code and running data quality check jobs. It is divided into the following sections:

  • Platform introduction
  • Rapid deployment
  • Running data quality check jobs

The goal of Datavines is to become a better open source project in the field of data observability, solving problems in metadata management and data quality management for more users. We sincerely welcome more contributors to participate in the community building, grow with us, and work together to build a better community.

/datavane/datavines
/datavane/datavines/issues
/datavane/datavines/pulls


Platform Introduction

Datavines is a one-stop open source data observability platform. It provides metadata management, data overview reports, data quality management, data distribution queries, data trend insights, and other core capabilities, and is designed to help users fully understand and control their data.

Rapid deployment

Environment preparation

Before installing Datavines, make sure the following software is installed on your server:

  • Git, to ensure that git clone runs successfully
  • JDK, version >= 8
  • Maven, to ensure the project can be packaged (you can also package it locally and upload it to the server later)
  • MySQL, version >= 5.7
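
Before going further, it can help to confirm that the commands listed above are actually available on the server. The following is a minimal sketch, not part of Datavines: the executable names in `REQUIRED_TOOLS` are assumptions about how the tools are installed, and the script only checks that each command is on `PATH`, not its version.

```python
import shutil

# Executable names assumed for the prerequisites listed above; adjust them
# if your installation uses different command names.
REQUIRED_TOOLS = ["git", "java", "mvn", "mysql"]


def missing_tools(tools, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required commands were found on PATH.")
```

Version requirements (JDK >= 8, MySQL >= 5.7) still need to be checked by hand, for example with `java -version`.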

Download Code

git clone /datavane/
cd datavines

Database preparation

Datavines stores its metadata in a relational database; MySQL is currently supported. The following uses MySQL as an example to illustrate the installation procedure:

  • Create a database named datavines
  • Run the script in script/sql/ to initialize the database

Building the project

Package and unpack

mvn clean package -Prelease
cd datavines-dist/target
tar -zxvf datavines-1.0.

After unpacking, enter the directory

cd datavines-1.0.0-SNAPSHOT-bin

Editing configuration information

cd conf
vi 

Modify database information

spring:
 datasource:
   driver-class-name: 
   url: jdbc:mysql://127.0.0.1:3306/datavines?useUnicode=true&characterEncoding=UTF-8
   username: root
   password: 123456
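
A common mistake at this step is a JDBC URL whose host, port, or database name does not match the database created earlier. The sketch below is a hypothetical helper, not something shipped with Datavines. It splits a `jdbc:mysql://` URL into its parts using only the Python standard library, so the values can be compared against the database that was actually created.

```python
from urllib.parse import urlsplit, parse_qs


def parse_mysql_jdbc_url(jdbc_url):
    """Split a jdbc:mysql:// URL into host, port, database and query options."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix + "mysql://"):
        raise ValueError("expected a URL starting with jdbc:mysql://")
    # Drop the "jdbc:" prefix and parse the remaining mysql://... URL.
    parts = urlsplit(jdbc_url[len(prefix):])
    return {
        "host": parts.hostname,
        "port": parts.port or 3306,  # 3306 is MySQL's default port
        "database": parts.path.lstrip("/"),
        "options": {key: values[0] for key, values in parse_qs(parts.query).items()},
    }


if __name__ == "__main__":
    url = "jdbc:mysql://127.0.0.1:3306/datavines?useUnicode=true&characterEncoding=UTF-8"
    print(parse_mysql_jdbc_url(url))
```

With the URL from the configuration above, the `database` field should read `datavines`, matching the database created in the preparation step.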

If you use Spark as the execution engine and submit jobs to YARN, the YARN-related settings must also be configured:

  • standalone mode
=standalone
=http://%s:%s/ws/v1/cluster/apps/%s # the first %s needs to be replaced with the ip address of yarn
=8088
  • ha mode
=ha
=http://%s:%s/ws/v1/cluster/apps/%s
=8088
=192.168.0.1,192.168.0.2
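
To make the URL template above more concrete, the sketch below fills its three `%s` placeholders with a host, a port, and an application id using Python's `%` formatting. How Datavines itself substitutes the placeholders is not shown in this article, so treat this purely as an illustration; the application id used here is hypothetical.

```python
# Template copied from the configuration above.
YARN_APP_URL_TEMPLATE = "http://%s:%s/ws/v1/cluster/apps/%s"


def yarn_app_url(host, port, application_id, template=YARN_APP_URL_TEMPLATE):
    """Fill the three placeholders: ResourceManager host, port, application id."""
    return template % (host, port, application_id)


if __name__ == "__main__":
    # "application_1_0001" is a made-up id used only for illustration.
    print(yarn_app_url("192.168.0.1", 8088, "application_1_0001"))
```

The resulting address has the form of the YARN ResourceManager REST endpoint for a single application, which can be opened in a browser or fetched with any HTTP client to confirm that the host and port are reachable.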

Starting services

cd bin
sh  start mysql

Check the log. If it contains no error messages and you can see a line like the following, the service has started successfully:
[INFO] 2022-04-10 12:29:05.447 :[61] - Started DatavinesServer in 3.97 seconds (JVM running for 4.69)
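
The log check described above can also be scripted. This is a minimal sketch under two assumptions: the startup line always contains the text `Started DatavinesServer`, and error entries carry an `[ERROR]` tag in the same style as the `[INFO]` tag shown above. The log file location is not given in this article, so the functions accept any iterable of lines.

```python
STARTED_MARKER = "Started DatavinesServer"


def server_started(log_lines):
    """Return True if any log line contains the startup marker."""
    return any(STARTED_MARKER in line for line in log_lines)


def error_lines(log_lines):
    """Return the lines tagged [ERROR] (an assumed log-level format)."""
    return [line for line in log_lines if "[ERROR]" in line]


if __name__ == "__main__":
    sample = [
        "[INFO] 2022-04-10 12:29:05.447 :[61] - Started DatavinesServer "
        "in 3.97 seconds (JVM running for 4.69)",
    ]
    print("started:", server_started(sample))
    print("errors:", error_lines(sample))
```

To use it on a real log, read the file into a list of lines first and pass that list to both functions.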

Accessing the front-end page

Enter Server IP:5600 in your browser. You will be redirected to the login screen, where you can log in with the account and password admin/123456.

Running data quality check jobs

Creating a Data Source

After entering the home page, click the Create Data Source button in the upper right corner, enter the name of the data source, and select the data source type. Taking MySQL as an example, enter the MySQL connection information and click the Test Connection button. If the test succeeds, click Save.

Entering the data source

Click the data source to enter it, then find the job management page.

Creating a check job

  • Click the Create Rule Job button and select Data Quality Job

  • Go to the configuration page of the rule

  • Perform rule configuration

    • Select the Enumeration value [not in] check rule
    • Select the database, table and column in turn
    • Enter the enumeration list [0,1]
  • Perform Expectation Configuration

    • If there is no expected value, select None
  • Perform check configuration

    • Select Actual value as the check formula, > as the comparator, and enter the threshold 10
    • This forms the formula [Actual value > 10]. When the formula holds, the check result is a success; otherwise it is a failure.
  • Perform error data configuration

    • Choose to save the error data in the source data source and fill in a database you have already created.
  • After completing the configuration, click Save and run to execute the check job.
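
The rule configured above can be illustrated with a few lines of Python. This is a sketch of the logic only, not Datavines' implementation, and it assumes that the "actual value" is the number of rows whose column value is not in the enumeration list. The function names and sample data are hypothetical.

```python
def count_not_in_enum(values, allowed):
    """Count the values that are not members of the allowed enumeration list."""
    allowed_set = set(allowed)
    return sum(1 for value in values if value not in allowed_set)


def check_passes(actual_value, threshold):
    """Evaluate the formula [Actual value > threshold] from the check configuration."""
    return actual_value > threshold


if __name__ == "__main__":
    column_values = [0, 1, 2, 3, 1, 0, 5]  # made-up sample column
    actual = count_not_in_enum(column_values, [0, 1])
    print("actual value:", actual)
    print("check result:", "success" if check_passes(actual, 10) else "failure")
```

In this sample, three values fall outside [0,1], so the formula does not hold against the threshold 10 and the check is reported as a failure.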

Viewing information about a rule job

In the job list, locate the check job that was just created and executed.

Open the Execution Records page to see the execution history list.

Click the Log button to see the log information of the rule execution.

Click the Result button to see the check results of the rule execution.

Click the Error Data button to see the error data of the rule execution.

Concluding remarks

This article has walked through the whole process of deploying, installing, and running the Datavines platform, step by step. Give it a try; there is more waiting for you to explore.

About Datavane

Datavane is an open source organization (community) focused on big data, co-founded by a group of open source project authors in that field. It aims to help open source authors build better projects and to provide high-quality open source software to the public. Its purpose is simply to make good software. It has so far gathered a number of high-quality open source projects related to data integration, big data component management, data quality, and more.

In the Datavane community, all projects are open source, and the community is open to promising projects with quality code and architectural design. The community values openness and neutrality, collaboration and creativity, and a commitment to excellence. It encourages all developers, users, and contributors to take an active part, work together, and innovate to build a stronger open source community.

Official website./
Github : /datavane