How do I import historical data? Documentation

When transitioning to Segment, customers commonly want to import historical data into the tools they are migrating to or evaluating.

Note: Historical imports can only be done into destinations that accept historically timestamped data. Most analytics tools, such as Mixpanel, Amplitude, and Kissmetrics, handle that type of data just fine. One common destination that does not is Google Analytics, since its API cannot accept historical data.

Use any server-side library; these send requests in batches to improve performance. Once you have data to import, follow the steps below:

  1. Export or collect the data to be imported.

    Include timestamp data in your export if the data needs to appear in end tools at its historical date. For instance, if you are importing emails and it is relevant when they joined your email list, you need to export the timestamp as well. If no timestamp is specified when importing, the data will show the time at which it was received.

  2. Decide which destinations need to receive the data.

    By default, data coming into Segment is relayed to all destinations linked to a given source. To limit data to specific destinations, the integrations object must be modified. With historical data, you often only want to send the data to a specific destination or into your data warehouse. For example, in Node.js set the integrations object as follows.

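// Assumes `analytics` is an already initialized Segment Node.js client.
analytics.track({
    event: 'Upgraded Membership',
    userId: '97234974',
    // Turn every destination off, then enable only the one that should receive this call.
    integrations: { 'All': false, 'Vero': true, 'Google Analytics': false }
});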
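Steps 1 and 2 can be combined in a single call. The following is a minimal sketch, not an official recipe: the row shape (`user_id`, `event`, `occurred_at`) and the helper name `toTrackPayload` are hypothetical, chosen only for illustration. It builds a track payload that carries the exported timestamp and limits delivery to one destination.

```javascript
// Hypothetical exported row; the field names are assumptions for illustration.
const row = {
  user_id: '97234974',
  event: 'Upgraded Membership',
  occurred_at: '2015-03-14T09:26:53Z',
};

// Build a track payload that keeps the original event time and
// restricts delivery to a single named destination.
function toTrackPayload(exportedRow, destination) {
  return {
    userId: exportedRow.user_id,
    event: exportedRow.event,
    // Without a timestamp, the data would show the time it was received.
    timestamp: new Date(exportedRow.occurred_at),
    integrations: { All: false, [destination]: true },
  };
}

const payload = toTrackPayload(row, 'Vero');
// With an initialized server-side client you would then call:
// analytics.track(payload);
```

Keeping the mapping in a plain function makes it easy to check a few rows by hand before sending anything to Segment.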

Once you’ve done that, you’ll need to write an application or worker to send the data to Segment.

You will need to cycle through each set of data and either map it to a Segment server-side library method or build an array matching the HTTP Import API format. We recommend using a Segment library for this process, because the libraries set contextual message fields such as message_id (used for deduping) and sent_at (used for correcting client clock skew) that our API relies on to ensure correct behavior upon ingestion. The server-side libraries automatically batch requests to optimize performance, so request volume does not grow linearly with the number of events. This batching behavior is configurable. Some of the libraries implement a configurable max queue size and may discard messages if you enqueue requests much faster than the client can flush them. We recommend overriding the library's max queue size parameter with a value high enough that your batch job is sure to stay under it.
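The loop described above might look roughly like the sketch below. It is an illustration under assumptions: the row fields and the `importRows` helper are hypothetical, and a stub object stands in for a real Segment client so the example runs without a write key. A real worker would pass an initialized library client instead and flush it before exiting, using whatever flush method that library version provides.

```javascript
// Send every exported row through a client's track() method.
// `client` is any object exposing track(payload); in a real import this
// would be an initialized Segment server-side library client.
function importRows(rows, client) {
  let sent = 0;
  for (const row of rows) {
    client.track({
      userId: row.user_id,
      event: row.event,
      timestamp: new Date(row.occurred_at), // preserve the historical time
    });
    sent += 1;
  }
  return sent;
}

// Stub client that only records what it was asked to send (illustration only).
const calls = [];
const stubClient = { track: (payload) => calls.push(payload) };

// Hypothetical exported data.
const rows = [
  { user_id: 'u1', event: 'Signed Up', occurred_at: '2014-11-02T18:05:00Z' },
  { user_id: 'u2', event: 'Upgraded Membership', occurred_at: '2015-01-20T07:30:00Z' },
];

const sentCount = importRows(rows, stubClient);
console.log(`enqueued ${sentCount} events`);
```

Injecting the client keeps the mapping logic testable and leaves batching and queue-size settings to the library's own configuration.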

One of our Success Engineers wrote an alpha prototype Node.js app for importing data using the HTTP API.

If a server-side library doesn’t meet your needs, use the Segment bulk import HTTP API. If you’re using the HTTP API directly to replay data you’ve exported from Segment, we recommend removing the original sent_at, message_id, and project_id fields from each archived message before forwarding it to Segment.
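As a rough sketch of preparing archived messages for replay, the code below removes the three fields named above and groups the cleaned messages into request bodies. Several things here are assumptions rather than documented facts: the helper names, the shape of the archived messages, the chunk size of 100, and the `{ batch: [...] }` body shape. Check the HTTP API reference for the exact payload format and size limits before relying on it. No request is actually sent.

```javascript
// Fields recommended for removal before replaying archived messages.
const FIELDS_TO_STRIP = ['sent_at', 'message_id', 'project_id'];

// Return a copy of the message without the fields listed above.
function stripReplayFields(message) {
  const copy = { ...message };
  for (const field of FIELDS_TO_STRIP) {
    delete copy[field];
  }
  return copy;
}

// Group cleaned messages into request bodies of at most `size` messages.
// The { batch: [...] } shape and the default size are assumptions.
function toBatchBodies(messages, size = 100) {
  const bodies = [];
  for (let i = 0; i < messages.length; i += size) {
    bodies.push({ batch: messages.slice(i, i + size).map(stripReplayFields) });
  }
  return bodies;
}

// Hypothetical archived messages.
const archived = [
  { type: 'track', userId: 'u1', event: 'Signed Up', timestamp: '2014-11-02T18:05:00Z',
    sent_at: '2014-11-02T18:05:01Z', message_id: 'm-1', project_id: 'p-1' },
  { type: 'track', userId: 'u2', event: 'Signed Up', timestamp: '2014-11-03T10:00:00Z',
    sent_at: '2014-11-03T10:00:02Z', message_id: 'm-2', project_id: 'p-1' },
  { type: 'track', userId: 'u3', event: 'Signed Up', timestamp: '2014-11-04T12:30:00Z',
    sent_at: '2014-11-04T12:30:01Z', message_id: 'm-3', project_id: 'p-1' },
];

const bodies = toBatchBodies(archived, 2);
console.log(JSON.stringify(bodies[0], null, 2));
```

The original `timestamp` is kept on purpose, since it is what lets the data appear at its historical date in destinations.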

Our friends at MarketLytics have written up their experience using the alpha prototype importer and offer some helpful visuals and tips.

If you have any questions, or see anywhere we can improve our documentation, please let us know!
