# Dealing with the order of features (sequences)?

2018-06-24 10:57:59

Assume we have the following sequence database, which is subsequently converted with one-hot encoding (columns are positions in the sequence, rows are individual sequences):

|   | 1 | 2 | 3 | 4  |
|---|---|---|---|----|
| 0 | A | B | C | D  |
| 1 | B | A | D | NA |
| 2 | A | D | C | NA |

One-hot encoded:

| A | B | C | D |
|---|---|---|---|
| 1 | 1 | 1 | 1 |
| 1 | 1 | 0 | 1 |
| 1 | 0 | 1 | 1 |
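To make the loss of order concrete, here is a minimal sketch in plain Python that reproduces the encoding shown above. The helper name `bag_of_items` and the use of `None` for NA are assumptions made for illustration, not part of the original data pipeline.

```python
ITEMS = ["A", "B", "C", "D"]

def bag_of_items(sequence):
    """Return a 0/1 vector marking which items occur anywhere in the sequence."""
    present = {item for item in sequence if item is not None}
    return [1 if item in present else 0 for item in ITEMS]

# The three sequences from the table; None stands in for NA (assumption).
sequences = [
    ["A", "B", "C", "D"],
    ["B", "A", "D", None],
    ["A", "D", "C", None],
]
encoded = [bag_of_items(seq) for seq in sequences]

# A reordering of the first sequence collapses to the same vector,
# so the position of each item can no longer be recovered.
same_items_other_order = bag_of_items(["D", "C", "B", "A"])
```

Here `encoded[0]` and `same_items_other_order` are identical, even though the two sequences list the items in opposite order.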

In fact, the real data also contains co-occurring items, i.e. several items at the same position:

|   | 1   | 2   | 3  | 4  |
|---|-----|-----|----|----|
| 0 | A,B | C   | D  |    |
| 1 | B   | A,D | NA |    |
| 2 | A   | D   | C  | NA |

Problem:

When the sequential data is converted with one-hot encoding, one key piece of information is lost: the order (sequence) of the items in the dataframe. Since I would like to make predictions based on the sequence of actions (A, B, C, D), how can I encode the data so that the order is preserved?

Or: Is an LSTM able to deal with this data?
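One common way to keep the order is to encode each position separately, so that every sequence becomes a small matrix of shape (time steps, items) instead of a single vector. The sketch below does this in plain Python for the co-occurring example. The helper `encode_steps`, the representation of each step as a set, and the choice of all-zero rows as padding for NA are assumptions for illustration. Recurrent layers such as an LSTM typically expect input of shape (samples, time steps, features), which is the layout produced here.

```python
ITEMS = ["A", "B", "C", "D"]
INDEX = {item: i for i, item in enumerate(ITEMS)}

def encode_steps(sequence, n_steps=4):
    """Encode a sequence as n_steps rows, one multi-hot vector per time step.

    Each step is a set of items (several items may co-occur);
    missing steps (NA) become all-zero padding rows (assumption).
    """
    rows = []
    for step in range(n_steps):
        row = [0] * len(ITEMS)
        items = sequence[step] if step < len(sequence) else set()
        for item in items:
            row[INDEX[item]] = 1
        rows.append(row)
    return rows

# The co-occurring example: each inner set is one position in the sequence.
sequences = [
    [{"A", "B"}, {"C"}, {"D"}],
    [{"B"}, {"A", "D"}],
    [{"A"}, {"D"}, {"C"}],
]

# Nested list with shape (3 samples, 4 time steps, 4 items).
tensor = [encode_steps(seq) for seq in sequences]
```

With this layout, `[{"A"}, {"B"}]` and `[{"B"}, {"A"}]` produce different matrices, so the order survives the encoding. In a real model the padding rows would usually be masked so they do not influence the prediction.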