The trie, pronounced “try”, is a tree that specializes in storing data that can be represented as a collection, such as English words:
Each character of a string maps to a node, and the node for the last character of each word is marked as terminating. Terminating nodes are marked in the diagram above with a dot. The benefits of a trie are best illustrated by looking at it in the context of prefix matching.
In this chapter, you’ll first compare the performance of a trie to a list. Then you’ll implement the trie from scratch!
List vs. Trie
You’re given a collection of strings. How would you build a component that handles prefix matching? Here’s one way:
class EnglishDictionary {
  final List<String> words = [];

  List<String> lookup(String prefix) {
    return words.where((word) {
      return word.startsWith(prefix);
    }).toList();
  }
}
lookup will go through the collection of strings and return those that match the prefix.
This algorithm is reasonable if the words list is small. But once you’re dealing with more than a few thousand words, scanning the entire list on every lookup becomes too slow. The time complexity of lookup is O(k × n), where k is the length of the longest string in the collection and n is the number of words you need to check.
The trie data structure has excellent performance characteristics for this problem. Since it’s a tree with nodes that support multiple children, each node can represent a single character.
You form a word by tracing the characters from the root to a node marked with a special indicator, a terminator, represented by a black dot. An interesting characteristic of the trie is that multiple words can share the same characters.
To illustrate the performance benefits of the trie, consider the following example in which you need to find the words with the prefix CU. First, you travel to the node containing C. That quickly excludes other branches of the trie from the search operation:
Next, you need to find the words with the next letter, U. You traverse to the U node:
Since that’s the end of your prefix, the trie would return all collections formed by the chain of nodes from the U node. In this case, the words CUT and CUTE would be returned.
Imagine if this trie contained hundreds of thousands of words. The number of comparisons you can avoid by employing a trie is substantial.
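Before building the real thing, the walk described above can be sketched with nested maps. This is only an illustration, not the implementation used later in the chapter: buildNestedTrie, wordsWithPrefix and the terminator constant are hypothetical names, and any words other than CUT and CUTE are made up for the example.

```dart
/// Hypothetical marker key: a node that contains it ends a word.
const terminator = '•';

/// Builds a trie out of nested maps, one map per character.
Map<String, dynamic> buildNestedTrie(List<String> words) {
  final root = <String, dynamic>{};
  for (final word in words) {
    var node = root;
    for (final char in word.split('')) {
      // Reuse the child if it exists, otherwise create it.
      node = node.putIfAbsent(char, () => <String, dynamic>{})
          as Map<String, dynamic>;
    }
    node[terminator] = true;
  }
  return root;
}

/// Walks down the prefix, then gathers every word below that node.
List<String> wordsWithPrefix(Map<String, dynamic> root, String prefix) {
  var node = root;
  for (final char in prefix.split('')) {
    final next = node[char];
    // A missing branch means no word starts with this prefix.
    if (next is! Map<String, dynamic>) return [];
    node = next;
  }
  final results = <String>[];
  void collect(Map<String, dynamic> current, String word) {
    if (current.containsKey(terminator)) results.add(word);
    for (final entry in current.entries) {
      final child = entry.value;
      // The terminator entry holds a bool, so it is skipped here.
      if (child is Map<String, dynamic>) collect(child, word + entry.key);
    }
  }

  collect(node, prefix);
  return results;
}
```

Calling wordsWithPrefix(buildNestedTrie(['cut', 'cute', 'to']), 'cu') only ever touches the C branch; the T branch is never visited.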
Implementation
As always, open up the starter project for this chapter.
TrieNode
You’ll begin by creating the node for the trie. Create a lib folder in the root of your project and add a file to it named trie_node.dart. Add the following to the file:
This interface is slightly different compared to the other nodes you’ve encountered:
key holds the data for the node. It’s optional because the root node of the trie has no key. It’s called a key because you use it in a map of key-value pairs to store children nodes.
TrieNode holds a reference to its parent. This reference simplifies the remove method later on.
In binary search trees, nodes have a left and right child. In a trie, a node needs to hold multiple different elements. The children map accomplishes that.
isTerminating acts as a marker for the end of a collection.
Note: A parent TrieNode holds a reference to its children and the children hold a reference to the parent. You might wonder if this creates a circular reference problem where the memory is never released. Languages like Swift that use reference counting for memory management need to be especially careful about this. Dart, on the other hand, frees up the memory from its unused objects with a garbage collector, which is able to handle the parent-children circular references in the code above. Garbage collection works not by counting references to individual objects but by checking if objects are reachable from certain root objects.
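A node with the members that the rest of this chapter relies on might look like the sketch below. It is inferred from how the later methods use the node (lookups in children by key, plus parent, key and isTerminating), so treat it as an assumption rather than the exact listing. In particular, the nullable map values are a guess based on the child! access in the prefix-matching code.

```dart
/// A sketch of a trie node, inferred from its usage later in the chapter.
class TrieNode<T> {
  TrieNode({this.key, this.parent});

  /// The data for this node. Null only for the root.
  T? key;

  /// The node one level up. Null only for the root.
  TrieNode<T>? parent;

  /// Child nodes, looked up by their key.
  /// Assumption: values are nullable, matching the `child!` usage later on.
  Map<T, TrieNode<T>?> children = {};

  /// True when the path from the root to this node spells a stored collection.
  bool isTerminating = false;
}
```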
Trie
Next, you’ll create the trie itself, which will manage the nodes. Since strings are one of the most common uses for tries, this chapter will walk you through building a String-based trie. In Challenge 2 at the end of the chapter, you’ll create a generic trie that can handle any iterable collection.
In the lib folder, create a new file named string_trie.dart. Add the following to the file:
The trie stores each code unit in a separate node. You first check if the node exists in the children map. If it doesn’t, you create a new node. During each loop, you move current to the next node.
After the for loop completes, current references the node at the end of the collection, that is, the last code unit in the string. You mark that node as the terminating node.
The time complexity for this algorithm is O(k), where k is the number of code units you’re trying to insert. This cost is because you need to traverse through or create a new node for each code unit.
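A minimal sketch of StringTrie with a root node and an insert method could look like this. It assumes the node shape used by contains and remove in this chapter and repeats a compact TrieNode so the snippet compiles on its own. The exact body of insert is an assumption, written to agree with how contains walks the trie.

```dart
// Compact node, repeated here only so the sketch is self-contained.
class TrieNode<T> {
  TrieNode({this.key, this.parent});
  T? key;
  TrieNode<T>? parent;
  Map<T, TrieNode<T>?> children = {};
  bool isTerminating = false;
}

class StringTrie {
  /// The root has neither a key nor a parent.
  TrieNode<int> root = TrieNode(key: null, parent: null);

  void insert(String text) {
    var current = root;
    for (final codeUnit in text.codeUnits) {
      // Create the child only if this code unit isn't in the trie yet.
      current.children[codeUnit] ??= TrieNode(
        key: codeUnit,
        parent: current,
      );
      current = current.children[codeUnit]!;
    }
    // The node for the last code unit marks the end of the string.
    current.isTerminating = true;
  }
}
```

Inserting "cut" and then "cute" with this sketch creates four nodes below the root. The second insert reuses the C, U and T nodes and only adds E.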
Contains
contains is very similar to insert. Add the following method to StringTrie:
bool contains(String text) {
  var current = root;
  for (var codeUnit in text.codeUnits) {
    final child = current.children[codeUnit];
    if (child == null) {
      return false;
    }
    current = child;
  }
  return current.isTerminating;
}
You check every code unit to see if it’s in the trie. When you reach the last one, it must be terminating. If not, the collection wasn’t added, and what you’ve found is a subset of a larger collection.
Like insert, the time complexity of contains is O(k), where k is the length of the string you’re using for the search. This time complexity comes from traversing through k nodes to determine whether the code unit collection is in the trie.
To test insert and contains, replace the contents of your main file with the following:
import 'package:starter/string_trie.dart';

void main() {
  final trie = StringTrie();
  trie.insert("cute");
  if (trie.contains("cute")) {
    print("cute is in the trie");
  }
}
Run that, and you should see the following console output:
cute is in the trie
Remove
Removing a node from the trie is a bit more tricky. You need to be particularly careful since multiple collections can share nodes.
void remove(String text) {
  // 1
  var current = root;
  for (final codeUnit in text.codeUnits) {
    final child = current.children[codeUnit];
    if (child == null) {
      return;
    }
    current = child;
  }
  if (!current.isTerminating) {
    return;
  }
  // 2
  current.isTerminating = false;
  // 3
  while (current.parent != null &&
      current.children.isEmpty &&
      !current.isTerminating) {
    // Remove the entry itself rather than setting it to null. A null entry
    // would keep the parent's children map non-empty and stop the pruning.
    current.parent!.children.remove(current.key);
    current = current.parent!;
  }
}
Taking it comment-by-comment:
1. You check if the code unit collection you want to remove is part of the trie and point current to the last node of the collection. If you don’t find your search string or the final node isn’t marked as terminating, that means the collection isn’t in the trie and you can abort.
2. You set isTerminating to false so the current node can be removed by the loop in the next step.
3. This is the tricky part. Since nodes can be shared, you don’t want to remove code units that belong to another collection. If there are no other children in the current node, it means that other collections don’t depend on the current node. You also check to see if the current node is terminating. If it is, then it belongs to another collection. As long as current satisfies these conditions, you continually backtrack through the parent property and remove the nodes.
The time complexity of this algorithm is O(k), where k represents the number of code units in the string you’re trying to remove.
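One way to convince yourself that only unshared nodes are pruned is to count the nodes before and after a removal. The sketch below is self-contained and makes a few assumptions: countNodes is a hypothetical helper, the TrieNode and insert bodies are reconstructions, and nodes are detached with Map.remove so that an emptied branch really disappears from its parent's children map.

```dart
class TrieNode<T> {
  TrieNode({this.key, this.parent});
  T? key;
  TrieNode<T>? parent;
  Map<T, TrieNode<T>?> children = {};
  bool isTerminating = false;
}

class StringTrie {
  TrieNode<int> root = TrieNode(key: null, parent: null);

  void insert(String text) {
    var current = root;
    for (final codeUnit in text.codeUnits) {
      current.children[codeUnit] ??= TrieNode(key: codeUnit, parent: current);
      current = current.children[codeUnit]!;
    }
    current.isTerminating = true;
  }

  bool contains(String text) {
    var current = root;
    for (final codeUnit in text.codeUnits) {
      final child = current.children[codeUnit];
      if (child == null) return false;
      current = child;
    }
    return current.isTerminating;
  }

  void remove(String text) {
    var current = root;
    for (final codeUnit in text.codeUnits) {
      final child = current.children[codeUnit];
      if (child == null) return;
      current = child;
    }
    if (!current.isTerminating) return;
    current.isTerminating = false;
    // Walk upward, detaching nodes that no other string depends on.
    while (current.parent != null &&
        current.children.isEmpty &&
        !current.isTerminating) {
      current.parent!.children.remove(current.key);
      current = current.parent!;
    }
  }
}

/// Hypothetical helper: counts every node reachable from [node], itself included.
int countNodes(TrieNode<int> node) {
  var total = 1;
  for (final child in node.children.values) {
    if (child != null) total += countNodes(child);
  }
  return total;
}
```

With "cut" and "cute" inserted, countNodes(trie.root) is 5. Removing "cute" drops only the E node, since T still terminates "cut". Removing "cut" afterward prunes the whole branch and leaves just the root.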
Back in main, replace the existing code with the following:
final trie = StringTrie();
trie.insert('cut');
trie.insert('cute');
assert(trie.contains('cut'));
print('"cut" is in the trie');
assert(trie.contains('cute'));
print('"cute" is in the trie');
print('\n--- Removing "cut" ---');
trie.remove('cut');
assert(!trie.contains('cut'));
assert(trie.contains('cute'));
print('"cute" is still in the trie');
Run that, and you should see the following output added to the console:
"cut" is in the trie
"cute" is in the trie
--- Removing "cut" ---
"cute" is still in the trie
Prefix Matching
The most iconic algorithm for a trie is the prefix-matching algorithm. Write the following at the bottom of StringTrie:
List<String> matchPrefix(String prefix) {
  // 1
  var current = root;
  for (final codeUnit in prefix.codeUnits) {
    final child = current.children[codeUnit];
    if (child == null) {
      return [];
    }
    current = child;
  }
  // 2 (to be implemented shortly)
  return _moreMatches(prefix, current);
}
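Here’s what’s happening:
1. You start by verifying that the trie contains the prefix. If not, you return an empty list.
2. Once you reach the node that ends the prefix, you call the recursive helper _moreMatches to collect every collection below that node.
Next, add the code for the helper method after the matchPrefix method: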
List<String> _moreMatches(String prefix, TrieNode<int> node) {
  // 1
  List<String> results = [];
  if (node.isTerminating) {
    results.add(prefix);
  }
  // 2
  for (final child in node.children.values) {
    final codeUnit = child!.key!;
    results.addAll(
      _moreMatches(
        '$prefix${String.fromCharCode(codeUnit)}',
        child,
      ),
    );
  }
  return results;
}
This method performs the following tasks:
1. You create a list to hold the results. If the current node is a terminating one, you add what you’ve got to the results.
2. Next, you need to check the current node’s children. For every child node, you recursively call _moreMatches to seek out other terminating nodes.
matchPrefix has a time complexity of O(k × m), where k represents the longest collection matching the prefix and m represents the number of collections that match the prefix. Recall that lists have a time complexity of O(k × n), where n is the number of elements in the entire collection. For large sets of data in which each collection is uniformly distributed, tries have far better performance than using lists for prefix matching.
Time to take the method for a spin. Navigate back to main and run the following:
final trie = StringTrie();
trie.insert('car');
trie.insert('card');
trie.insert('care');
trie.insert('cared');
trie.insert('cars');
trie.insert('carbs');
trie.insert('carapace');
trie.insert('cargo');
print('Collections starting with "car"');
final prefixedWithCar = trie.matchPrefix('car');
print(prefixedWithCar);
print('\nCollections starting with "care"');
final prefixedWithCare = trie.matchPrefix('care');
print(prefixedWithCare);
You should see the output below in the console:
Collections starting with "car"
[car, card, care, cared, cars, carbs, carapace, cargo]
Collections starting with "care"
[care, cared]
Challenges
How was this chapter for you? Are you ready to take it a bit further? The following challenges will ask you to add functionality to and generalize what you’ve already accomplished. Check out the Challenge Solutions section or the supplemental materials that come with the book if you need any help.
Challenge 1: Additional Properties
The current implementation of StringTrie is missing some notable operations. Your task for this challenge is to augment the current implementation of the trie by adding the following:
An allStrings property that returns all the collections in the trie.
A count property that tells you how many strings are currently in the trie.
An isEmpty property that returns true if the trie is empty, false otherwise.
Challenge 2: Generic Trie
The trie data structure can be used beyond strings. Make a new class named Trie that handles any iterable collection. Implement the insert, contains and remove methods.
Key Points
Tries provide great performance metrics for prefix matching.
Tries are relatively memory efficient since individual nodes can be shared between many different values. For example, “car,” “carbs,” and “care” can share the first three letters of the word.