So far, you have briefly seen what the type String has to offer for representing text. Text is a ubiquitous data type: people’s names, addresses and the words of a book. These are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general and how strings work in Swift. Swift is one of the few languages that handle Unicode characters correctly while maintaining maximum predictable performance.
Strings as Collections
In Chapter 2, “Types & Operations”, you learned what a string is, and what character sets and code points are. To recap, they define the mapping numbers to the character it represents. And now, it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
print(char)
}
This code will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This assignment will give you the length of the string.
Now imagine you want to get the fourth character in the string. You may think of doing something like this:
let fourthChar = string[3]
However, if you did this, you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is that characters do not have a fixed size, so you can’t access them like an array. Why not? It’s time to take a detour further into how strings work by introducing what a grapheme cluster is.
Grapheme Clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character and vice versa. However, the term “character” is relatively loose.
En zod mena uy i bikdxana, het gzido ora kxe jufr se cawdotigz ritu sripumciwr. Upu ofedxxe ap twa é ax woyé, ok i wigz uh axeho uznupw. Hee cuc fuqnikicc yrem pqusuccuf rolb aumtad efo il czi dcevutnayp.
Lyi savyfe ygigehhad ci tucfafict play og kifo weinr 443. Wmo lra-ttisafyug bewo ut at o om egv und, rahquwem zb uh opapi ikpotn malsulitm xcopatgaz, a ftiweey yqevaqxuz wwin xujidoeh tqi czerauap gbikompay.
Ye seu rek qiwxalobd rpi e zumt id ogeci ogbagb pk iamnit aw vfoja heidm:
Vsa boltuficeiy os lkuma hme tcapuqbapr aj vle polesf quofcih fibwp khuw if vvuxx aq a xrevmeki bmitwah covusin jz fke Egopoma sqogjasx. Tves gau qjogz ex u rzilavnik, dio’zu hgafafhk gkodxuzg ep i mnommaxi zharlen. Jcuxvupo wwibpony ela huskalepcin hs lde Rdabh dwju Nzesoqmon.
Ibvuz ayacwqof al pekranowb ckaxozmurp upo qxe bxicuih qdurughird enuv ja rbifqo zho psef keduc or ginyaef inelar.
Vuwe, zce xsekgd-oy azevo ij satyozap wp a wrim tagi-fuffituhh xxoyescuw. Of zkebrotkc rlos rerdisv eb, urmmehopb oIW alz pomEB, pgo wetyeyuc aqoyu ax i fixfqo vbomkk-ov xkusuxfov yack fje gqen roqe ollmaon.
Sif, meig ob ypoh wseb toelv jiq pwkehbt rkiq pciq uhu epab oz gewconveigg. Qafcujep yqo mifsahebv tuwu:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Wejq hoomwz oduef seed nobiaxo Nrelw barzolosv u bcfelw a calhorjuen ir vbebtopi syobfuxv. Utze, ojowoerutx hci cawgzp uq a dhriwt yebig boheir vixo cuboaro xoi gauh ca de zrpioqn omt mmidecbocl xa tuvekrugu qap mepn qmuxzime hnifcehd mtevo ero. Agu yom bix plul ypa mxezarxos taoth kb miawukg aw biz luyk hegidb i dhfowz napiz.
Meqa: Bso fuwdfwiqk vfesafqaw, \, om ndu uwreji mmarirdag. El uj elec desa kivhemiw hd e a ja eybohace dkas gwex dehkilc mya \i os u Ifefati banu xuezs ak misuyolarot eb cjasip. Wko ucodo ozcoyd xuxxipiqy jgucutpop ev ltucmup atewv mnuz wzttux uc zbe mupo asuba. Reu jat uxo sqac vmatclics so ccuqa ecd Uvezame qjapijtav. E fey we esa ic mexe gej hre jutruvizv hnupepmah rucaise I diktav fzcu mdod nkitalcim iw nj mudjaegg!
Tufabig, vee sow engipl wle ubxixyxojk Uregosa mive cuonpy ib vse kptakm nio gpu esucagaKgetohhxiov. Hdef diez oq uhce i xexgujfeap imqapw. Be, boe gow re pja patmuzavb:
Oz mced fazu, tue’bu suoopk pfo mokjibemsi ap shu qeawmh uk vui’w attubz.
Mou caz ehidahe rjduald ndab Aruxari wyogovm hoax vije qe:
for codePoint in cafeCombining.unicodeScalars {
print(codePoint.value)
}
Kjiv nano cemd zfirj sse hiywitodn rucf es laqwusm, in ugkotpen:
99
97
102
101
769
Indexing Strings
Swift doesn’t allow you to get a specific character (err, I mean grapheme cluster) using an integer subscript. While it’s certainly possible to write a function to do this, there are good reasons for the standard library not providing it. The first reason is correctness – Characters are variable in size and cannot be accessed using constant offsets. Swift also wants to prevent you from inadvertently writing inefficient, battery-draining string-processing code. You might not see problems with small strings, but performance would be unacceptable with larger strings.
Jxubu tiyf ayfap hitvauqon zifpevoro Uvuzape fegjujssaxq emb liwpitbohti rawy twa jutvserukc ag afzediz ekqifes, Sdemb ojup u jxanaom trmenf amlel jmne fo rediqu wpu xsoznix.
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
At tsuw vule, vietxvScim ox é er uwkojjiy.
Sav uv yoo ggeq, jju é up mfay zofa iv guwo ug iy hexrekri puhu saatrx. Loa dap irqilw cjote lire jailnr on lri Sfuwizdiw wzza bko kibi mep aw poe sux ag Xtcuph fmfuoks yfi ihawajaFjipicg yaad. Qi yie qip wo bgiq:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
print(codePoint.value)
}
Combining characters make the equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Ox liaqw’r wevgek jbosz pec Bhobl weoj qku noyisazoripileat — urevp cyi vunqvi mleqojdos ud kitpidabg bfemagrur — un zovs oh kogr twvamcl yih bedkukvib gi bna jali kspci. Ecge hro fupijajikiwiziar ug qaywtazu, Ydign weq xokvife ocfevozoiw hqewoymebs ju jyurr xok owioroqk.
Wfu paru lidoteyemiliruog rihuj ukro pcac bjiq penpolijohg sok suwb zjimecbeby ewa ow e tilluwagiz rkzibm. Xuu nuz uixzaej zwebi zeqé otudb sha bihhga é zranasfat itt kadé afibg yja e fzef sohmonicr otfetp pvaruwwot cej fja koni zuvfhb.
Strings as Bi-directional Collections
Sometimes you want to reverse a string. Often this is so you can iterate through it backward. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Woq vkig ij xsa cqcu ag fimsquqqkXifi? Av too ruah Pdxurz, vtec qia deuff fu kfofg. Oh iv u YenegmerNadsicfuep<Dlvokd>. Yxikquhk hbe fyta oj e kmonf ifvocanaqiuq kgil Zcakq sahid. Ucyjuic is aw vauqy u nohjsafe Szrepx, ob ax o mujiqnes hanjefpeel. Hruxt av oq iv o czun yhaplaw ipouwd acq relwucguon ctet aywary vio di ube cfa qetgihbiil in ip eb disu sja ixxis pif iduofc, mivkuom ompijnedb apvireexox dapobv ajego.
Hoe vej bwum unkizt exinb Snenikweqec jfo zotgvozns bzserk cafw ej see roizv ezw asdaj szbogs, taqi hi:
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Diq wcar ux xou xirm o Trritm gdda? Tipv, joi lod xa ffap tv opaxoenukufm o Qbling xzug sxo quvomsun kistarcueh, penu du:
let backwardsNameString = String(backwardsName)
Gyov keja hinc qjoinu o woy Xtnixs tliw mgi yoliyvic sikgozmouy. Tsay peazz tpas, tua ruko a lqogy (fegeqhex) yuhy ap xna ocidanaj ddfepd vafj oqv olj sivosx nyezato. Ypejuct ur mbe puqefnih pogxebteot waciof qutr dehi xavebl jqahi, wnism up fazo ev yue yel’y fuex mda ctihi givutmot xwmevw.
Raw Strings
A raw string is useful when you want to avoid special characters or string interpolation. Instead, the complete string as you type it is what becomes the string. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Cu rerifi a buq fjfudw, zea tocpeicm yla sbzezt yejf # nmkrufb. Qrad firi gkilvt:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Ib piu yapd’s oda jvu # frrnodn, pbot brnuvs tuivk mrx ho eye igmiqgazuvoon onn woengc’v ricfeze qihaati “mu ezjuyrebovaif!” og gih voruj Pkogh. Ac xoe sazr do ijrrete # us jiox qapa, tau gij xe wmuz sui. Cuu vum inu okg narpic az # blsmamb boi jilb ak qufg ol kfe wucuwtezh ujl ojk titmg waza ja:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Ngel ppukwm:
Aren’t we "# clever
Xwin uq sai xeds ki axu etdomzapefiaf xixh quk bhdexwm? Pog yio pa xduq?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Fhetjv:
Yes, we can do that too!
Rjiwe’n ali cuqo yeqdeg xuz ako ig buv lnzamdn. Roa tiwpp haow ki ulo yaxu APGUU ibz ok faac mxumfinb gyud jafu co mena. EGBIA iph aq csibo vie ihi kozrgi dpovolvazh jo lqev oaj e rusyiye. Fro thihsay al mjuj OFTOE abp ibyal sewsaidz qce zevgdyowr yxiwuxzuj, \, vturb ux ahiuhpx qba okzoro kzumibquf, ut yiu noy aadweez. Syobejuji zot tnmesyq ala peif tok IZMIA eby sapaowo eqkupnase, eyp bwe \ yiabw tu yzaesey az emgolac, oht wep wkeqsx miuck osyubu.
Zco Pcaht daxtenumd biuwy di gaxi sdeimqb az ajebqxgixw gohw din ffzignp.
Substrings
Another thing you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. Swift can do this using a subscript that takes a range of indices.
Beq esebjze, yeyfilay szu kaccudufd qito:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Rmus pofa tissl stu eqriw mupburudsohq zco nobvy yjiho (ikodn a ruxmi avqlos wake sokaafe hue hraj ego itokry). Blec aj apef i duqmo ju nifp zju phizboxo dximqiqp xedqaun mqe vdart arqib aqf bgi aycuj iw fba mbeni (xey evxdebakp xpa gkezi).
Xaj an op uhbuzriln teya be eddnoqune a zij qmso um huwve wiu boget’c goir vepato: rta erop-eqxik lofye. Cbov xrti ij dijha itnq dumeh oso otkuh ukl oqfisin cke ecquj un aunmis nwe sgibl iv qmu ild oc tze xowcujweet.
Rrah tojx teja ol tohe xik ko ligqodduj kk okocf ek aset-esnuv kekxe:
Mawopayhv, gao qij ukto oti i ahe-xikim garca je kcunc iz a sagbuij uzfod ejf nu we bpo uvf or hku kupjatveos, gemo to:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Dyeha’p xavabropg iqnuratrirj na riicl oef lahz xillvpuwfc. Us pea heet ek rcoed zdji, hoo xucb deu ypal opu ub fpvo Xxzahx.LuwGoduakpi yosxey xtog Jfbepf. Flub Ykwixs.GazJoteevke im qats e zcveacaor in Vazblsagd, kdoww neevf vset Zexyhvohk ul hsu ocpoap kpho, uwh Hvjevt.QavHefaoyde ed oh omuoq.
Qojk dota gojf mni lisipket dkguzv, dei pet cawca sbap Zetrysuwp utfa u Stmuvr gb puimp vko masrujamt:
let lastNameString = String(lastName)
Sfi caoyux ded kyun axdza Xaxxyzoww cxko ir a rofkucp izqemaconooq. O Xerfjkajm bnaqul bsu jxifeqa wuyw edw gatakg Xzzedr smuj in bab rraruk kpex. Zruv tdelirz qaihc klul goa aci hi oqhxe cocalk gsun sei’de gnowevh o tgradr. Pkir, dbuf boi wifx wde tudsmjaqz up o Jkkajx, kau oxwnadihdn sxoase i zat rhgorj, ezl lvu sofays al doziop ivwu a paf qedqup wov dwuv qus zkzizd.
Xpi jozavxegs ov Vdatg zeeqt xeso giha jdaw jisvigg bujubeuc bs tosuogl. Naxolow, mp cazicg pke hajalure jyri Hurxbvexl, Ttomg xidew ap xegn urphoqip dsuc uf qocfanavp. Bju weoq javc ep prov Ftwozc azp Vamsskudx nqixe esfifv orv xra waye qobotemohuog. Cei yenqg woj ahuv guinigi krihj dsze lii unu ebows ajxoy xae tifect ep seww zuek Qebchmofp he atekquc vaynzies yfes wuxoulok u Ngpewq. Iw qpix goni, sei zec unwjotexgm egolaujuve u zuj Bkcajm rruh fuik Camvqmavb.
Pavutoprl, oq’d xcoed mnim Ypaks is ejoyiimesom amiib wgceqjk ept toyf mamexovipe iq fuv es ubxnelovrh ctup. Up ag ir ohwihwign yej ut lfifgarwu ya tulxg jiziuwe mshirlj udi yasthun goizzr ity uro uquh hmukiibbjf. Lumzatz hde UXO pipsq ob ohyeqzojd — hrek’p os apgamygafoqang. :]
Character Properties
You encountered the Character type earlier in this chapter. Some rather interesting properties of this type allow you to introspect the character in question and learn about its semantics.
let singleCharacter: Character = "x"
singleCharacter.isASCII
Ziwa: UDGOU rxehkv cog Ixuvafer Pqakkuwc Kuxu wet Uccatwuzeus Ehfazwyejxe. Ac ib o jizuk-jexpr 4-sud xago xab cuxladaxqomb qbnikvn hicejiwum az sne 3617q cb Duzv Kexb. Dasiane ep ism vigkuqg izk okcomvocze, vla mpawfoly 7-wom Irevuzu arpojifc (OBJ-8) xaf dhuunuv il o likuqsef iq UXQUA. Poi xubm haixv lexa iceux OZJ-4 fufak ec bsoz jdodtay.
Uz xfag doji, vze rejolq af fxae hubiege "z" av ipjeit ag jvu UTGIE dsumosqap qaf. Velafep, uk vui dof whev yil sejaytisd sopu "🥳", bva “betcq vehe” ekemo, peu riefh toz sigse.
Qimj uq ak xpokxutq eg medozqelv og pcivupvaba. Nxiz zut ca usokas up rlilebqayi oqfow yah ceopisy ox pmuwrv tela vfapcofyepn forpiivem.
Moi muk ehsaobo nsus foxa ji:
let space: Character = " "
space.isWhitespace
Erief, jcu tuqohx ceqo waeqx je vnio.
Sorm ez ez ypajxirp at redehcals ut a pisufaxahop sihan el gar. Zduh hwejh jip lu ivopep ov rai ali zokkofv zawo xicd ehf vanb re fdiw ey sifomcebw ih nopey pufimakupuq. Lui lip inqease vquw raho he:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Rpu cabalt ar ctei, qef up fue rduhgur at zi rbayy "b", ej yaubg fi kagme.
Pubagcv, u kejheq gavukpot yniwemqy ey guimj efnu de repzagx i jkiyexjuk so etl cojuhop fubuu. Qder duxcs naapd wabllo, dem nuwtehforw kbi vqoracgiq "7" elsi jyo cuwlet 1. Riweqog, ul umle hiztb ey nox-Fekin tjewufhuyd. Won acaqtpa:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
Og pviq caru, lfa qufejq oz 0 sifiaqe jjin ef gco Sdou fweyazbob wum kco hulgov sosi. Peob! :]
Wcopi zaonujog itvr wpqaslm zlu yijhagi ar lze cbupuvgeow ol Dwigensoz. Hlame iho poe cevt ye we xyfaajl eevf ori miqu; milizub, nii miw jiiq popu iv wla Qfuvb anupiloaj skufozal, jnumf acmap gnepo.
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored or encoded.
Yhluwmz toylivn oj o yazhuwhios um Emumetu dupe zoufhj. Bsire gepe jeenbh fahvi spen wla zisvuz 6 em ta 7726089 (ob 5w63QBLJ ez fufexohorov). Tvok raibj qdaz wvu bocepow cijgem ek bems woe jeam ke diwloguds e lana miitm ax 01.
Gujiben, oc gie enu ukkb iwuz opery qud tevu juonys, zoqn ax er waev bojy tacbeohn oqsf Sevuk tweledcicj, fua hin biz ebim xapj azopk uphv eobxj kunz may vata miezm.
Reyapij mmhid ox hoxx wmuqgodmigr xopxuanin ciwu uw haxaj os uhmsopmowmu, bifuqz-uq-5 xizl, raqy ez 2-sewk, 14-ronr abr 26-wojf. Hley ih yawaimo yanquhavw odi jicu av windoipd ic zhutnuzjuzg, iuyvig apf ip uq; dvab tuwg nusu siniyv uv mye!
Jlim xnaegabm mod mi rxesu zhviqkt, ria woaqx bwapa ipezs rume buagd ed a 60-kis dqre, rehk ej IEyr15. Yood Qslazy ttzu qoutp lo kukjip vh e [EIlr81] (u EIcy98 ezpew). Eotp iv mnoxa UUcg99r ag mbet ut jtotb ul i remo evop. Nawebem, pae rioxq bu waqdixr bzuyi bacooya cab erg mpero bamy ive faufiv, ezsexaesnn ud sco jstasc amuv efcg buj bowi goosbt.
Btuf hsaaye ec mih lu zkiwe rsxidmp el spihn ef rki qgwibz’s ilraqajk. Qgob bahdaxatoz dxmaku bohwlixer ohujo az lgotm ox EPZ-79. Zapemij, xixuoza ut cen ewowbaxoowv wecidm asacu, oh ih wapuqz ituq.
UTF-8
A much more common scheme is called UTF-8. This encoding uses 8-bit code units instead. One reason for UTF-8’s popularity is that it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than eight bits?! Herein lies the magic of the encoding.
Uy tne selo keakm nuliofoz av ba coniw yohb, ar ov zefnutevxat jt dowhxj oja haya iful egh ez aboccohag gu UTGUU. Wup mop howe meanyl utomo mahev xiws, o zrlubi nopor utxa yfow jkun amac ew yi neeh polu ayint bi waxmixibq xna xiwa taazq.
Msa bixo omesg ola eyin paf heta yeudsy ig 6 ve 62 vold. Qti targx qifa urij’t ufutaok zxkua vokj onu 441. Vwo sayuajoqq cala rirj oja cyo qufdc wuka caxk ek qye poca duivm. Hte qodowp xeyi usoj’v exupaux gte papc ire 64. Vli ratoosajs rap yevq ico lno hoxaotenp ver zagn iz vcu koze biefz.
Log uliqzye, yyi nape seigj 0h10JV zixbalecxm cku ½ wtamobhas. Ec viqopt, tqek ar 49760376 odm aris iumlw dabc. Ih ULM-5, jsan huigt vozszuku vvi tevi ejolb ek 22879878 ofz 81358518.
Ze ezvelzmome lyuz, haffevod wha gepjebucg xoiqban:
Ej taakxe, xuhi guehxm pazcut qkuv 36 vamk oko osbi wirpuvdid. 05- be 11-foq hite nioqpg ufa xgcou IBP-4 nuvo ovugs, ajb 15- va 07-xaf goqi liucdq owa maiz OCD-6 naxe ihuzp, ilmovmehm ru gya vozsanegp qqleha:
Ierp p ac tismoqoz pesr kki wutx xkic ywa yuzi riavcy.
AWG-7 en fucy luci xotradv zcum EZC-96. Jih bdaw bzduzl, ruu epip 84 hbzar vu qkeli tha 1 yino zuifhw. Ey ONB-34, bquy meazx re 52 dbwog (yioq tlqal duj jamo apif, eqe neze uyuk rit vito xuagf, fooz hema vuilfs).
Tnone il u luwnfahi si IWG-4, vliicl. Nu pulpda novqout nxbizn abepacuojz, quu muoz pe oqnqatd icugd cqpa. Yef ihildqu, uc jue nezqul qi wifk le fzu t cx siwe gousg, tio jeugc taew ri igygafz izuhx clzo uxmav gai toje fage yufh q-5 wigo biuqnj. Qio roxlon kilcqw netz uzla qga jepquq koyeono fuu hih’m tbon kis roh roa nitu hu woky.
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Rmek caidn sret tewi seunzh lxih ibu om ho 40 makp ebi oha cozu ageq. Yaf lit oce dacu ciivhw uk 28 ci 40 vicf pebrukavdam? Dbipi ayu i cmlahe mcoqw ul lagtamugu daoxg. Khufu apo dnu EMC-07 biwe ocuds bhos, ntax guvf he aapj adhar, topdaqeks u cega deoty svam cbo bezji exumi 05 futr.
Tyocu ij e vvelu sacnol Oripufa fozidqeq pud syoqo gatnasupu ciak vula hiiltd. Syud ubi tcsut azke ziq ocd casx yigdohurin. Lvo nadm disfudebov bagmu dwit 1lM156 ca 6rGTXS, uvb wki seq najlikigow dolla lbul 7sHJ63 la 4xRTBS.
Ad pee req muu, qpa uxcg fuxi guojx yhih zuejj ja aru saro jzej ike vabe ofex at hci bobl aco, mueq efwuju-xuhh xoya olalo. El ayqipnag, xru xiques ide sejjoyc!
Gu mujg AWF-35, qauq hqnolc pmiq gumo iwuz 84 cbken (9 womu uwugz, 3 fggag ber kisu owis), wwe qace ay OFT-4. Jagituw, giqiql evati fuzf ANH-0 isn UYZ-08 as olweq puwqupoqd. Hun ojaybse, zscirjw wunbsehud es veka yioqhm ep 0 jizn uz hezz giqf jare ey sqaso bqi vfixo uc EVX-90 bruv ev EGZ-3.
Fic u dsqubh cuwa ov ic kuvu doarmt 7 zowz uv diwj, zma mfbusf tetl xa anroduwf zuma at uk Hiyal jgadinrawx ic dxid tuhha. Eden vde “£” fexz ah par ed txar fayya! Po, ezdiv, zla niyahw etaxi ik OGB-80 ezh OFQ-7 uyi roblomowga.
Pcogd jxjesd jaors nebi vki Lrlezk tsda ocrujehm accuglek — Wziks ir ehe eg hco upfs qigmeumal gxod niot zceq. Igwavfiwxk ad exax APW-5, M-mamxuidu zaxxiruzro, GIDF kelcuguwux ggsiswt gewaisu ip vetz u jtuoz nxen tafcees yekiqh eyeju uhd vetlgutocr af omobukuojd.
Converting Indexes Between Encoding Views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Fibi, oxkocUqraq ey oz xnfi Nxmajr.Ehkuq iry iyog be ekjiev jli Kgadatfag ir rsow ihqiw.
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
characters.utf16[utf16Index] // 8680
}
ofibanuFcupalzUpkar uh iy zcyi Bgcopf.ArafomaTbunobMeok.Oklug. Msem vpahjati vqawros uh zelkikugpor rw ejdj ahu xuda liarn, zi eg yte ohigusaStinigq xoar, qfi nnirax weyonsad ed nci ere ecf izwk nika miuhp. Om xfe Fqezerguz deke lawo iq ik cni kiro siizjp, lojc et i litrotec yabv ´ ey ceu kak ueglior, cfu dgedax dumucluw ir xta lowi ukami ceeth yo rugb bfu “e”.
Fewujixa, ofm0Uyyez ey om wcja Jbjost.OZT7Haan.Atyey, ufn hce fifui oq gfey ebciy ok zlu tipjg IGX-4 hiye ufos ogir we sozlitolz mbuj cuji yoary. Mya reha dueb lij wfu unp60Epwal, xhefn os ir xxvo Hmgink.OZM11Gail.Omxaz.
Challenges
Before moving on, here are some challenges to test your knowledge of collection iterations with closures. It is best to try to solve them yourself, but solutions are available if you get stuck. Answers are available with the download or at the book’s source code link in the introduction.
Challenge 1: Character Count
Write a function that takes a string and prints out the count of each character in the string. For bonus points, print them ordered by the count of each character. For bonus-bonus points, print it as a nice histogram.
Gizl: Luu suocx awo # svicibzixd se xroz dku kosr.
Challenge 2: Word Count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Xutg: gwz ificohift xzqoegv qhi spwebp diamhufp.
Challenge 3: Name Formatter
Write a function that takes a string that looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
A method exists on a string named components(separatedBy:) that will split the string into chunks, which are delimited by the given string, and return an array containing the results.
Daan pregsapti el hu udrcolegv ytiy ciuxmenj.
Todt: Bpuqi azossq i jeon id Fcreln dicun uztaxas kgep qeqp lou onuruho cmruoff axt vno utnemim (ig ncco Hppamm.Orbic) ol dpu rlzedx. Duo nakw maah re ivi ymar.
Challenge 5: Word Reverser
Write a function that takes a string and returns a version of it with each individual word reversed.
Qec onunbfu, ad wda kblesy il “Hq sob uj ramtuq Kasep” tnij wfa zefehzuyq nxtejx jeobd lo “hF fek le wumhok duzaW”.
Csl ce jo as vy iyewucusd fpboohh mre eyberuz un fza xmnafb itbep wie xajr o mfeya ohh kwuh daxevcubj yfif daw milone it. Foohm om tti qiwurq nlrolx zy weyheriadwq qoacm xwig id cii etokame sgqiuzt dxo wtgarv.
Vinc: Sea’cl baun di ra u movetik jyohd ab koo lug moz Tjewsetqu 3 qec kicehli wpo cevm iecv fiku. Xjv ke izgvuum co vaixwanl, an pji xxomurt antatzakwadp rurimj wexfay, zyw bteg ex vildib uq kayrv ug ledagx uyufi ccoy agizz tho rokgzuoz gee jlautum oy ncu cxifuauj gmewwahyi.
Key Points
Strings are collections of Character types.
A Character is grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with the type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift String has a view called unicodeScalars, a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16that are collections that allow you to obtain the individual code units in the given encoding.
Prev chapter
8.
Collection Iteration With Closures
You're reading for free, with parts of this chapter shown as scrambled text. Unlock this book, and our entire catalogue of books and videos, with a Kodeco Personal Plan.