Usage
merge_segments(dtm, min_segment_size = 10, doc_id = NULL)
Arguments
- dtm
document-term matrix (dtm) of segments
- min_segment_size
minimum number of forms per segment
- doc_id
character string giving the name of the dtm docvar that identifies source documents.
Value
the original dtm with a new rainette_uc_id docvar.
Details
If min_segment_size == 0, no segments are merged together.
If min_segment_size > 0, doc_id must be provided,
unless the corpus comes from split_segments; in that case,
segment_source is used by default.
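The two ways of identifying source documents described above can be sketched as follows. This is a minimal illustration rather than an official example: it assumes the quanteda and rainette packages are installed, and the corpus (quanteda's data_corpus_inaugural), the segment size of 40, and the minimum size of 30 are arbitrary choices made for the demonstration.

```r
library(quanteda)
library(rainette)

# Illustrative corpus shipped with quanteda (an assumption of this sketch).
corpus <- head(data_corpus_inaugural, n = 3)

# split_segments() adds a segment_source docvar to each segment.
segments <- split_segments(corpus, segment_size = 40)

# Build the dtm of segments.
dtm <- dfm(tokens(segments, remove_punct = TRUE))

# doc_id omitted: the corpus comes from split_segments(), so
# segment_source is used by default.
merged <- merge_segments(dtm, min_segment_size = 30)

# Same call with the docvar name passed explicitly.
merged_explicit <- merge_segments(dtm, min_segment_size = 30,
                                  doc_id = "segment_source")

# The returned dtm carries the new rainette_uc_id docvar.
head(docvars(merged, "rainette_uc_id"))
```

For a dtm that was not built from split_segments output, doc_id would need to name whichever docvar identifies the source documents.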