Module:String2
The module String2 contains a number of string manipulation functions that are much less commonly used than those in Module:String. Because Module:String is cascade-protected (some of its functions are used on the Main Page), it cannot be edited or maintained by template editors, only by admins. String-handling functions rarely need maintenance, but it is useful to let template editors carry it out where possible, so this module may be used by template editors to develop novel functionality.
The module contains three case-related calls that convert strings to first letter uppercase, sentence case or title case and two calls that are useful for working with substrings. There are other utility calls that strip leading zeros from padded numbers and transform text so that it is not interpreted as wikitext, and several other calls that solve specific problems for template developers such as finding the position of a piece of text on a given page.
The functions are designed with the possibility of working with text returned from Wikidata in mind. However, a call to Wikidata may return empty, so the functions should generally fail gracefully if supplied with a missing or blank input parameter, rather than throwing an error.
Functions
trim
The trim function simply trims whitespace characters from the start and end of the string.
title
The title function capitalises the first letter of each word in the text, apart from a number of short words recommended by The U.S. Government Printing Office Style Manual: a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor. The first word is always capitalised.
sentence
The sentence function finds the first letter and capitalises it, then renders the rest of the text in lower case. It works properly with text containing wiki-markup. Compare:
{{#invoke:String2|sentence|[[action game]]}}
-> Action game
{{ucfirst:{{lc:[[action game]]}}}}
-> action game
Piped wiki-links are handled as well:
{{#invoke:String2|sentence|[[trimix (breathing gas)|trimix]]}}
-> Trimix
So are lists:
{{#invoke:String2 |sentence |{{hlist ||[[apples]] |[[pears]] |[[oranges]]}}}}
→ Template:hlist
ucfirst
The ucfirst function is similar to sentence; it renders the first alphabetical character in upper case, but leaves the capitalisation of the rest of the text unaltered. This is useful if the text contains proper nouns, but it will not regularise sentences that are ALLCAPS, for example. It also works with text containing piped wiki-links and with html lists.
findlast
- Function findlast finds the last item in a list.
- The first unnamed parameter is the list. The list is trimmed of leading and trailing whitespace
- The second, optional unnamed parameter is the list separator (default = comma space). The separator is not trimmed of leading and trailing whitespace (so that leading or trailing spaces can be used).
- It returns the whole list if the separator is not found.
One potential issue is that the separator is inserted into a Lua pattern without escaping, so using Lua special pattern characters (^$()%.[]*+-?) as the separator will probably cause problems.
Case | Wikitext | Output |
---|---|---|
Normal usage | {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291}} | 6,145,291 |
Space as separator | {{#invoke:String2 |findlast | 5 932 992,532 6,074,702 6,145,291 }} | 5 932 992,532 6,074,702 6,145,291 |
One item list | {{#invoke:String2 |findlast | 6,074,702 }} | 6,074,702 |
Separator not found | {{#invoke:String2 |findlast | 5, 932, 992,532, 6,074,702, 6,145,291 |;}} | 5, 932, 992,532, 6,074,702, 6,145,291 |
List missing | {{#invoke:String2 |findlast |}} | |
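The separator problem noted above can be avoided by escaping the separator before it is placed in the pattern. The following is a minimal sketch in plain Lua, not the module's code: `escape_pattern` and this standalone `findlast` are hypothetical names, and the standard `string` library stands in for the `mw` library.

```lua
-- Hypothetical helper: prefix every Lua pattern magic character with "%"
-- so that the separator is matched literally.
local function escape_pattern(s)
    return (s:gsub("[%^%$%(%)%%%.%[%]%*%+%-%?]", "%%%0"))
end

-- Sketch of findlast with an escaped separator (assumption: plain Lua, no mw library).
local function findlast(list, sep)
    list = list:match("^%s*(.-)%s*$")            -- trim the list only
    if sep == nil or sep == "" then sep = ", " end
    -- the greedy ".*" pushes the match to the last separator in the list
    local last = list:match(".*" .. escape_pattern(sep) .. "(.*)")
    return last or list                          -- whole list if separator not found
end

print(findlast("5, 932, 992,532, 6,074,702, 6,145,291"))  --> 6,145,291
print(findlast("a.b.c", "."))                              --> c
```

Without the escaping step, a separator of `.` would be read as "any character" and the second call would return an empty string.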
posnq
- posnq (position, no quotes) returns the numerical start position of the first occurrence of one piece of text ("target") inside another ("source"). Unicode (UTF-8) characters are supported.
- It returns nil by default if no match is found, or if either parameter is blank. If no match is found it can return the value of an optional "nomatch" parameter.
- It takes the text to be searched in as the first unnamed parameter (or |source=), which is trimmed.
- It takes the text to match as the second unnamed parameter (or |target=), which is trimmed and any double quotes " are stripped out. That allows spaces at the beginning or end of the match string to be included in a consistent manner.
- It can take an optional third unnamed parameter (or |plain=), which is trimmed. If it's set to false, then the search accepts Lua pattern-matching for the target, otherwise a plain search is used.
- It can take an optional fourth unnamed parameter (or |nomatch=), which is trimmed. This value is returned if no match occurs. Setting |nomatch=0 makes the output compatible with the find function in Module:String.
- Examples
{{#invoke:String2 |posnq |This is a piece of text to be searched |ext}}
→ 21
{{#invoke:String2 |posnq |This is a piece of text to be searched |ent}}
→
{{#invoke:String2 |posnq |This is a piece of text to be searched |" pie"}}
→ 10
{{#invoke:String2 |posnq |This is a piece of text to be searched |" ece"}}
→
{{#invoke:String2 |posnq |source=This is a piece of text |target=ece}}
→ 13
{{#invoke:String2 |posnq |source=This is a piece of text |target=%s |plain=true}}
→
{{#invoke:String2 |posnq |source=This is a piece of text |target=%s |plain=false}}
→ 5
{{#invoke:String2 |posnq |source=This is a piece of text |target=ece |nomatch=0}}
→ 13
{{#invoke:String2 |posnq |source=This is a piece of text |target=xyz |nomatch=0}}
→ 0
{{#invoke:String2 |posnq |This is a piece of text |" of" |true |0}}
→ 16
{{#invoke:String2 |posnq |This is a piece of text |" of" |true |0}}
→ 0
{{#invoke:String2 |posnq |source=Meet at Café Nero |target=afé}}
→ 10
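The quote-stripping behaviour can be illustrated with a rough sketch in plain Lua. It is an approximation under stated assumptions rather than the module's implementation: the standalone `posnq` and `trim` below are hypothetical, and `string.find` counts bytes, whereas the module uses `mw.ustring.find`, which counts Unicode characters.

```lua
-- Hypothetical helper: trim leading and trailing whitespace.
local function trim(s)
    return (s:match("^%s*(.-)%s*$"))
end

-- Sketch of posnq (assumption: plain Lua; byte positions, not character positions).
local function posnq(source, target, plain, nomatch)
    source = trim(source or "")
    -- trim first, then strip the quotes, so quoted edge spaces survive
    target = (trim(target or ""):gsub('"', ""))
    if source == "" or target == "" then return nil end
    -- only the literal string "false" switches to Lua pattern matching
    local pos = string.find(source, target, 1, plain ~= "false")
    return pos or nomatch
end

print(posnq("This is a piece of text to be searched", '" pie"'))  --> 10
print(posnq("This is a piece of text", "%s", "false"))            --> 5
```

Because the target is trimmed before the quotes are removed, `" pie"` keeps its leading space and only matches where a space precedes `pie`.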
split
The split function splits text at boundaries specified by separator and returns the chunk for the index idx (starting at 1). It can use positional parameters or named parameters (but these should not be mixed):
{{#invoke:String2 |split |text |separator |index |true/false}}
{{#invoke:String2 |split |txt=text |sep=separator |idx=index |plain=true/false}}
Any double quotes (") in the separator parameter are stripped out, which allows spaces and wikitext like ["[ to be passed. Use {{!}} for the pipe character |.
If the optional plain parameter is set to false / no / 0 then separator is treated as a Lua pattern. The default is plain=true, i.e. normal text matching.
The index parameter is optional; it defaults to the first chunk of text. A negative index counts back from the last chunk, so -1 returns the final chunk.
Template:Stringsplit is a convenience wrapper for the split function.
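The splitting rules described above can be sketched in plain Lua as follows. This is an illustration only: the module itself relies on `mw.text.split`, while the loop below is a hypothetical stand-in built on `string.find`, and `plain` is taken as a Lua boolean rather than the strings false / no / 0.

```lua
-- Sketch of split (assumption: plain Lua; the module uses mw.text.split instead).
local function split(text, sep, idx, plain)
    sep = ((sep or ""):gsub('"', ""))      -- strip double quotes from the separator
    idx = idx or 1                          -- default to the first chunk
    if plain == nil then plain = true end   -- default to plain text matching
    local chunks, start = {}, 1
    while true do
        local s, e = string.find(text, sep, start, plain)
        if not s or e < s then break end    -- stop on no match or an empty match
        chunks[#chunks + 1] = text:sub(start, s - 1)
        start = e + 1
    end
    chunks[#chunks + 1] = text:sub(start)
    if idx < 0 then idx = #chunks + idx + 1 end  -- negative index counts from the end
    return chunks[idx]
end

print(split("This is a piece of text to be split", '" "', 4))          --> piece
print(split("Apples pears oranges; Cats dogs", '"%A+"', 4, false))     --> Cats
print(split("a,b,c", ",", -1))                                         --> c
```

In this sketch an empty separator leaves the text unsplit, which may differ from the behaviour of `mw.text.split`.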
stripZeros
The stripZeros function finds the first number in a string of text and strips leading zeros, but retains a zero which is followed by a decimal point. For example: "0940" -> "940"; "Year: 0023" -> "Year: 23"; "00.12" -> "0.12"
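A minimal sketch of the same idea in plain Lua, assuming only the standard `string` library (the standalone function name is for illustration): the first run of digits is converted to a number and back, which drops the padding, and only that first run is replaced.

```lua
-- Sketch of stripZeros (assumption: plain Lua, no mw library).
local function stripZeros(s)
    -- first run of digits as a number; "" if the text has no digits
    local n = tonumber(s:match("%d+")) or ""
    -- replace only the first run of digits with its unpadded form
    return (s:gsub("%d+", tostring(n), 1))
end

print(stripZeros("0940"))        --> 940
print(stripZeros("Year: 0023"))  --> Year: 23
print(stripZeros("00.12"))       --> 0.12
```

The zero before a decimal point survives because the digit run "00" converts to the number 0, which is written back as "0".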
nowiki
The nowiki function ensures that a string of text is treated by the MediaWiki software as just a string, not code. It trims leading and trailing whitespace.
val2percent
The val2percent function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each number it finds into a percentage, then returns the resulting string.
one2a
The one2a function scans through a string, passed as either the first unnamed parameter or |txt=, and converts each occurrence of 'one ' into either 'a ' or 'an ', then returns the resultant string.
Template:One2a is a convenience wrapper for the one2a function.
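The substitution order matters here, so a short sketch in plain Lua may help. It reuses the chain of replacements from the module source but is a standalone illustration (no `mw` library, no frame handling), and the standalone function name is an assumption.

```lua
-- Sketch of one2a (assumption: plain Lua; same replacement chain as the module source).
local function one2a(txt)
    txt = txt:gsub(" one ", " a ")          -- "one" inside the text
             :gsub("^one", "a")             -- "one" at the very start
             :gsub("One ", "A ")            -- capitalised "One"
             :gsub("a ([aeiou])", "an %1")  -- "a" before a vowel becomes "an"
             :gsub("A ([aeiou])", "An %1")
    return txt
end

print(one2a("One foot. One inch."))  --> A foot. An inch.
print(one2a("one mile"))             --> a mile
print(one2a("banana is"))            --> bananan is
```

The last call shows a caveat of this approach: the article step matches any "a " followed by a vowel, including one at the end of an unrelated word, so the function is best applied to short, predictable strings such as the output of {{Convert}}.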
findpagetext
The findpagetext function returns the position of a piece of text in the wikitext source of a page. It takes up to four parameters:
- First positional parameter or |text is the text to be searched for.
- Optional parameter |title is the page title, defaults to the current page.
- Optional parameter |plain is either true for a plain search (default), or false for a Lua pattern search.
- Optional parameter |nomatch is the value returned when no match is found; default is nothing.
- Examples
{{#invoke:String2 |findpagetext |text=Youghiogheny}}
→
{{#invoke:String2 |findpagetext |text=Youghiogheny |nomatch=not found}}
→ not found
{{#invoke:String2 |findpagetext |text=Youghiogheny |title=Boston Bridge |nomatch=not found}}
→ Lua error: bad argument #1 to 'find' (string expected, got nil).
{{#invoke:String2 |findpagetext |text=river |title=Boston Bridge |nomatch=not found}}
→ Lua error: bad argument #1 to 'find' (string expected, got nil).
{{#invoke:String2 |findpagetext |text=[Rr]iver |title=Boston Bridge |plain=false |nomatch=not found}}
→ Lua error: bad argument #1 to 'find' (string expected, got nil).
{{#invoke:String2 |findpagetext |text=%[%[ |title=Boston Bridge |plain=f |nomatch=not found}}
→ Lua error: bad argument #1 to 'find' (string expected, got nil).
{{#invoke:String2 |findpagetext |text=%{%{[Cc]oord |title=Boston Bridge |plain=f |nomatch=not found}}
→ Lua error: bad argument #1 to 'find' (string expected, got nil).
The search is case-sensitive, so Lua pattern matching is needed to find river or River. The last example finds {{coord and {{Coord. The penultimate example finds a wiki-link.
Template:Findpagetext is a convenience wrapper for this function.
strip
The strip function strips the first positional parameter of the characters or pattern supplied in the second positional parameter.
matchAny
The matchAny function returns the index of the first positional parameter to match the source parameter. If the plain parameter is set to false (default true) then the search strings are Lua patterns. This can usefully be put in a switch statement to pick a switch case based on which pattern a string matches.
{{#invoke:String2|matchAny|123|abc|source=abc 124}} returns 2.
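The lookup can be sketched in plain Lua as below. This is an illustration under assumptions, not the module's code: the patterns are passed as a Lua table instead of numbered frame arguments, and `string.find` replaces `mw.ustring.find`.

```lua
-- Sketch of matchAny (assumption: plain Lua; patterns passed as a table).
local function matchAny(source, patterns, plain)
    if plain == nil then plain = true end   -- default to plain text matching
    for i, pattern in ipairs(patterns) do
        if string.find(source, pattern, 1, plain) then
            return tostring(i)              -- index of the first pattern that matches
        end
    end
    return ""                               -- empty string when nothing matches
end

print(matchAny("abc 124", {"123", "abc"}))      --> 2
print(matchAny("abc 124", {"%d%d4"}, false))    --> 1
```

Returning the index as a string, and an empty string for no match, keeps the result easy to feed into a switch statement or an {{#if:}} test.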
Deleted functions
- upper, lower: use {{uc:}} or {{lc:}}, which handle strip markers correctly.
- label: alias of ucfirst.
Usage
{{#invoke:String2 | sentence |…}}
- Capitalizes the first character and shifts the rest to lowercase
- Although similar to magic words' {{ucfirst:}} function, this call works even with piped wiki-links because it searches beyond leading brackets and other non-alphanumeric characters.
- It now also recognises when it has an html list passed to it and capitalises the first alphabetic letter beyond the list item markup (<li>) and any piped links that may be there.
{{#invoke:String2 | ucfirst |…}}
- Capitalizes the first alphabetic character and leaves the rest unaltered
- Works with piped wiki-links and html lists
{{#invoke:String2 | title |…}}
- Capitalizes all words, except for a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor.
{{#invoke:String2 | stripZeros |…}}
- Removes leading padding zeros from the first number it finds in the string
{{#invoke:String2 | nowiki |…}}
- Renders the string as plain text without wikicode
Parameters
These functions take one unnamed parameter comprising (or invoking as a string) the text to be manipulated:
- title
- sentence
- ucfirst
Examples
Input | Output |
---|---|
{{#invoke:String2| ucfirst | abcd }} | Abcd |
{{#invoke:String2| ucfirst | abCD }} | AbCD |
{{#invoke:String2| ucfirst | ABcd }} | ABcd |
{{#invoke:String2| ucfirst | ABCD }} | ABCD |
{{#invoke:String2| ucfirst | 123abcd }} | 123Abcd |
{{#invoke:String2| ucfirst | }} | |
{{#invoke:String2| ucfirst | human X chromosome }} | Human X chromosome |
{{#invoke:String2| sentence | abcd }} | Abcd |
{{#invoke:String2| sentence | abCD }} | Abcd |
{{#invoke:String2| sentence | ABcd }} | Abcd |
{{#invoke:String2| sentence | ABCD }} | Abcd |
{{#invoke:String2| sentence | [[action game]] }} | Action game |
{{#invoke:String2| sentence | [[trimix (breathing gas)|trimix]] }} | Trimix |
{{#invoke:String2 | sentence | {{#invoke:WikidataIB |getValue | P136 |fetchwikidata=ALL |onlysourced=no |qid=Q1396889}} }} | |
{{#invoke:String2 | sentence | {{#invoke:WikidataIB |getValue | P106 |fetchwikidata=ALL |list=hlist |qid=Q453196}} }} | |
{{#invoke:String2| sentence | }} | |
{{#invoke:String2| title | abcd }} | Abcd |
{{#invoke:String2| title | abCD }} | Abcd |
{{#invoke:String2| title | ABcd }} | Abcd |
{{#invoke:String2| title | ABCD }} | Abcd |
{{#invoke:String2| title | }} | |
{{#invoke:String2| title | the vitamins are in my fresh california raisins}} | The Vitamins Are in My Fresh California Raisins |
Posnq
Template:Posnq is a convenience wrapper for the posnq function.
{{Posnq |This is a piece of text to be searched |piece of }}
→ Template:Posnq
{{Posnq |This is a piece oftext to be searched |piece of }}
→ Template:Posnq
{{Posnq |This is a piece of text to be searched |"piece of "}}
→ Template:Posnq
{{Posnq |This is a piece oftext to be searched |"piece of "}}
→ Template:Posnq
Stringsplit
Template:Stringsplit is a convenience wrapper for the split function.
{{Stringsplit |This is a piece of text to be split |" "}}
→ Template:Stringsplit
{{Stringsplit |This is a piece of text to be split |" "| 4}}
→ Template:Stringsplit
{{Stringsplit |This is a piece of text to be split |x| 2}}
→ Template:Stringsplit
Modules may return strings with | as separators like this: {{#invoke:carousel | main | name = WPDogs | switchsecs = 5 }}
→ Racibórz 2007 082.jpg | English Bulldog, Racibórz, Poland
{{Stringsplit |{{#invoke:carousel | main | name = WPDogs | switchsecs = 5 }}|{{!}}| 2}}
→ Template:Stringsplit
Lua patterns can allow splitting at classes of characters such as punctuation:
{{Stringsplit |Apples, pears, oranges; Cats, dogs|"%p"| 2 |false}}
→ Template:Stringsplit
{{Stringsplit |Apples, pears, oranges; Cats, dogs|"%p"| 4 |false}}
→ Template:Stringsplit
Or split on anything that isn't a letter (no is treated as false):
{{Stringsplit |Apples pears oranges; Cats dogs|"%A+"| 4 |no}}
→ Template:Stringsplit
Named parameters force the trimming of leading and trailing spaces in the parameters and are generally clearer when used:
{{Stringsplit | txt=Apples pears oranges; Cats dogs | sep="%A+" | idx=3 | plain=false }}
→ Template:Stringsplit
One2a
Template:One2a is a convenience wrapper for the one2a function.
Capitalisation is kept. Intended for use with {{Convert}}.
{{one2a |One foot. One mile. One kilometer. One inch.One amp. one foot. one mile. one inch. Alone at last. Onely the lonely. ONE ounce. One monkey.}}
→
{{convert|1|ft|spell=on}}
→ Template:Convert
{{one2a|{{convert|1|ft|spell=on}}}}
→ Template:One2a
{{convert|2.54|cm|0|disp=out|spell=on}}
→ Template:Convert
{{one2a|{{convert|2.54|cm|0|disp=out|spell=on}}}}
→ Template:One2a
See also
Module:String for the following functions:
- len
- sub
- sublength
- match
- pos
- str_find
- find
- replace
- rep
local p = {}

p.trim = function(frame)
    return mw.text.trim(frame.args[1] or "")
end

p.sentence = function (frame)
    -- {{lc:}} is strip-marker safe, string.lower is not.
    frame.args[1] = frame:callParserFunction('lc', frame.args[1])
    return p.ucfirst(frame)
end

p.ucfirst = function (frame )
    local s = mw.text.trim( frame.args[1] or "" )
    local s1 = ""
    -- if it's a list chop off and (store as s1) everything up to the first <li>
    local lipos = mw.ustring.find(s, "<li>" )
    if lipos then
        s1 = mw.ustring.sub(s, 1, lipos + 3)
        s = mw.ustring.sub(s, lipos + 4)
    end
    -- s1 is either "" or the first part of the list markup, so we can continue
    -- and prepend s1 to the returned string
    local letterpos
    if mw.ustring.find(s, "^%[%[[^|]+|[^%]]+%]%]") then
        -- this is a piped wikilink, so we capitalise the text, not the pipe
        local _
        _, letterpos = mw.ustring.find(s, "|%A*%a") -- find the first letter after the pipe
    else
        letterpos = mw.ustring.find(s, '%a')
    end
    if letterpos then
        local first = mw.ustring.sub(s, 1, letterpos - 1)
        local letter = mw.ustring.sub(s, letterpos, letterpos)
        local rest = mw.ustring.sub(s, letterpos + 1)
        return s1 .. first .. mw.ustring.upper(letter) .. rest
    else
        return s1 .. s
    end
end

p.title = function (frame )
    -- http://grammar.yourdictionary.com/capitalization/rules-for-capitalization-in-titles.html
    -- recommended by The U.S. Government Printing Office Style Manual:
    -- "Capitalize all words in titles of publications and documents,
    -- except a, an, the, at, by, for, in, of, on, to, up, and, as, but, or, and nor."
    local alwayslower = {['a'] = 1, ['an'] = 1, ['the'] = 1,
        ['and'] = 1, ['but'] = 1, ['or'] = 1, ['for'] = 1,
        ['nor'] = 1, ['on'] = 1, ['in'] = 1, ['at'] = 1, ['to'] = 1,
        ['from'] = 1, ['by'] = 1, ['of'] = 1, ['up'] = 1 }
    local s = mw.text.trim( frame.args[1] or "" )
    local words = mw.text.split( s, " ")
    for i, s in ipairs(words) do
        -- {{lc:}} is strip-marker safe, string.lower is not.
        s = frame:callParserFunction('lc', s)
        -- the first word is always capitalised
        if i == 1 or alwayslower[s] ~= 1 then
            s = mw.getContentLanguage():ucfirst(s)
        end
        words[i] = s
    end
    return table.concat(words, " ")
end

-- findlast finds the last item in a list
-- the first unnamed parameter is the list
-- the second, optional unnamed parameter is the list separator (default = comma space)
-- returns the whole list if separator not found
p.findlast = function(frame)
    local s = mw.text.trim( frame.args[1] or "" )
    local sep = frame.args[2] or ""
    if sep == "" then sep = ", " end
    -- note: sep is used as a Lua pattern without escaping
    local pattern = ".*" .. sep .. "(.*)"
    local a, b, last = s:find(pattern)
    if a then
        return last
    else
        return s
    end
end

-- stripZeros finds the first number and strips leading zeros (apart from units)
-- e.g "0940" -> "940"; "Year: 0023" -> "Year: 23"; "00.12" -> "0.12"
p.stripZeros = function(frame)
    local s = mw.text.trim(frame.args[1] or "")
    local n = tonumber( string.match( s, "%d+" ) ) or ""
    s = string.gsub( s, "%d+", n, 1 )
    return s
end

-- nowiki ensures that a string of text is treated by the MediaWiki software as just a string
-- it takes an unnamed parameter and trims whitespace, then removes any wikicode
p.nowiki = function(frame)
    local str = mw.text.trim(frame.args[1] or "")
    return mw.text.nowiki(str)
end

-- posnq (position, no quotes) returns the numerical start position of the first occurrence
-- of one piece of text ("match") inside another ("str").
-- It returns nil if no match is found, or if either parameter is blank.
-- It takes the text to be searched in as the first unnamed parameter, which is trimmed.
-- It takes the text to match as the second unnamed parameter, which is trimmed and
-- any double quotes " are stripped out.
p.posnq = function(frame)
    local args = frame.args
    local pargs = frame:getParent().args
    for k, v in pairs(pargs) do
        args[k] = v
    end
    local str = mw.text.trim(args[1] or args.source or "")
    local match = mw.text.trim(args[2] or args.target or ""):gsub('"', '')
    if str == "" or match == "" then return nil end
    local plain = mw.text.trim(args[3] or args.plain or "")
    if plain == "false" then plain = false else plain = true end
    local nomatch = mw.text.trim(args[4] or args.nomatch or "")
    -- just take the start position
    local pos = mw.ustring.find(str, match, 1, plain) or nomatch
    return pos
end

-- split splits text at boundaries specified by separator
-- and returns the chunk for the index idx (starting at 1)
-- #invoke:String2 |split |text |separator |index |true/false
-- #invoke:String2 |split |txt=text |sep=separator |idx=index |plain=true/false
-- if plain is false/no/0 then separator is treated as a Lua pattern - defaults to plain=true
p.split = function(frame)
    local args = frame.args
    if not(args[1] or args.txt) then args = frame:getParent().args end
    local txt = args[1] or args.txt or ""
    if txt == "" then return nil end
    local sep = (args[2] or args.sep or ""):gsub('"', '')
    local idx = tonumber(args[3] or args.idx) or 1
    local plain = (args[4] or args.plain or "true"):sub(1,1)
    plain = (plain ~= "f" and plain ~= "n" and plain ~= "0")
    local splittbl = mw.text.split( txt, sep, plain )
    -- a negative index counts back from the last chunk
    if idx < 0 then idx = #splittbl + idx + 1 end
    return splittbl[idx]
end

-- val2percent scans through a string, passed as either the first unnamed parameter or |txt=
-- it converts each number it finds into a percentage and returns the resultant string.
p.val2percent = function(frame)
    local args = frame.args
    if not(args[1] or args.txt) then args = frame:getParent().args end
    local txt = mw.text.trim(args[1] or args.txt or "")
    if txt == "" then return nil end
    local function v2p (x)
        x = (tonumber(x) or 0) * 100
        if x == math.floor(x) then x = math.floor(x) end
        return x .. "%"
    end
    txt = txt:gsub("%d[%d%.]*", v2p) -- store just the string
    return txt
end

-- one2a scans through a string, passed as either the first unnamed parameter or |txt=
-- it converts each occurrence of 'one ' into either 'a ' or 'an ' and returns the resultant string.
p.one2a = function(frame)
    local args = frame.args
    if not(args[1] or args.txt) then args = frame:getParent().args end
    local txt = mw.text.trim(args[1] or args.txt or "")
    if txt == "" then return nil end
    txt = txt:gsub(" one ", " a "):gsub("^one", "a"):gsub("One ", "A "):gsub("a ([aeiou])", "an %1"):gsub("A ([aeiou])", "An %1")
    return txt
end

-- findpagetext returns the position of a piece of text in a page
-- First positional parameter or |text is the search text
-- Optional parameter |title is the page title, defaults to current page
-- Optional parameter |plain is either true for plain search (default) or false for Lua pattern search
-- Optional parameter |nomatch is the return value when no match is found; default is nil
p._findpagetext = function(args)
    -- process parameters
    local nomatch = args.nomatch or ""
    if nomatch == "" then nomatch = nil end
    --
    local text = mw.text.trim(args[1] or args.text or "")
    if text == "" then return nil end
    --
    local title = args.title or ""
    local titleobj
    if title == "" then
        titleobj = mw.title.getCurrentTitle()
    else
        titleobj = mw.title.new(title)
    end
    --
    local plain = args.plain or ""
    if plain:sub(1, 1) == "f" then plain = false else plain = true end
    -- get the page content and look for 'text' - return position or nomatch
    local content = titleobj and titleobj:getContent()
    -- an invalid title or a page without content gives nil here, so fail
    -- gracefully with nomatch rather than passing nil to find
    if not content then return nomatch end
    return mw.ustring.find(content, text, 1, plain) or nomatch -- returns multiple values
end

p.findpagetext = function(frame)
    local args = frame.args
    local pargs = frame:getParent().args
    for k, v in pairs(pargs) do
        args[k] = v
    end
    if not (args[1] or args.text) then return nil end
    -- just the first value
    return (p._findpagetext(args))
end

-- returns the decoded url. Inverse of parser function {{urlencode:val|TYPE}}
-- Type is:
-- QUERY decodes + to space (default)
-- PATH does no extra decoding
-- WIKI decodes _ to space
p._urldecode = function(url, type)
    url = url or ""
    type = (type == "PATH" or type == "WIKI") and type
    return mw.uri.decode( url, type )
end

-- {{#invoke:String2|urldecode|url=url|type=type}}
p.urldecode = function(frame)
    return mw.uri.decode( frame.args.url, frame.args.type )
end

-- what follows was merged from Module:StringFunc

-- helper functions
p._GetParameters = require('Module:GetParameters')

-- Argument list helper function, as per Module:String
p._getParameters = p._GetParameters.getParameters

-- Escape Pattern helper function so that all characters are treated as plain text, as per Module:String
function p._escapePattern( pattern_str)
    return mw.ustring.gsub( pattern_str, "([%(%)%.%%%+%-%*%?%[%^%$%]])", "%%%1" );
end

-- Helper Function to interpret boolean strings, as per Module:String
p._getBoolean = p._GetParameters.getBoolean

--[[
Strip

This function Strips characters from string

Usage:
{{#invoke:String2|strip|source_string|characters_to_strip|plain_flag}}

Parameters
    source: The string to strip
    chars: The pattern or list of characters to strip from string, replaced with ''
    plain: A flag indicating that the chars should be understood as plain text. defaults to true.

Leading and trailing whitespace is also automatically stripped from the string.
]]
function p.strip( frame )
    local new_args = p._getParameters( frame.args, {'source', 'chars', 'plain'} )
    local source_str = new_args['source'] or '';
    local chars = new_args['chars'] or '';
    source_str = mw.text.trim(source_str);
    if source_str == '' or chars == '' then
        return source_str;
    end
    local l_plain = p._getBoolean( new_args['plain'] or true );
    if l_plain then
        chars = p._escapePattern( chars );
    end
    local result;
    result = mw.ustring.gsub(source_str, "["..chars.."]", '')
    return result;
end

--[[
Match any

Returns the index of the first given pattern to match the input.
Patterns must be consecutively numbered.
Returns the empty string if nothing matches for use in {{#if:}}

Usage:
{{#invoke:String2|matchAny|source=123 abc|456|abc}} returns '2'.

Parameters:
    source: the string to search
    plain: A flag indicating that the patterns should be understood as plain text. defaults to true.
    1, 2, 3, ...: the patterns to search for
]]
function p.matchAny(frame)
    local source_str = frame.args['source'] or error('The source parameter is mandatory.')
    local l_plain = p._getBoolean( frame.args['plain'] or true )
    for i = 1, math.huge do
        local pattern = frame.args[i]
        if not pattern then return '' end
        if mw.ustring.find(source_str, pattern, 1, l_plain) then
            return tostring(i)
        end
    end
end

return p