Difference between revisions of "Help:Sorting"

MyWikiBiz, Author Your Legacy — Friday April 19, 2024

Latest revision as of 11:14, 6 June 2007

Tables can be made sortable via client-side JavaScript with class="sortable". Sortable tables are identified by the arrows in each of their header cells. Clicking an arrow sorts the table rows on the selected column, in ascending order first, and subsequently toggles between ascending and descending order. Links and other wiki-markup are not possible in headers.

Sort modes

The sort modes (the data types, which, in addition to the choice "ascending" or "descending", determine the sorting order) are as follows, in the given order (as soon as there is a match, subsequent criteria are not applicable; e.g., 24-12-2007 is a date, not a number). For the criteria, tags (e.g. span, sup, sub) are ignored.

  • date (see also below)
    • criterion: the first non-blank element is of the form "dd-dd-dddd", "dd-dd-dd", or "dd aaa dddd"; or, in the last case, text follows ending with "sm=d" (without the quotes; it stands for "sort mode = date"). (In hidden form this can conveniently be done with Template:Tim.)
    • order: a string abcdefghij of length 10 is positioned as ghijdeab; a string abcdefgh of length 8 is positioned as 19ghdeab if gh>=50 (string comparison) and as 20ghdeab otherwise (i.e., the assumed format is DD-MM-YYYY or DD-MM-YY); and a string "dd aaa dddd", with aaa an abbreviated month name, is positioned chronologically
  • "currency" (this mode can be useful for other data also)
    • criterion: the first non-blank element starts with $, £, €, or ¥; or the element ends with "sm=c" (without the quotes; it stands for "sort mode = currency"). (In hidden form this can conveniently be done with Template:Tim.)
    • order: numeric, ignoring these symbols and all ordinary letters and commas, but not spaces; note that scientific notation cannot be used, as e and E are removed
  • numeric
    • criterion: the first non-blank element consists of just digits, points, commas, spaces, "+", "-", possibly followed by "e" or "E" and a string consisting of "+", "-", digits, possibly followed by "×10" and a string consisting of "+", "-", digits (the latter allows hidden e-notation (https://mywikibiz.com/Scientific_notation#E_notation) to be followed by visible superscript scientific notation (https://mywikibiz.com/Scientific_notation)); or the element ends with "sm=n" (without the quotes; it stands for "sort mode = numeric"). (In hidden form this can conveniently be done with Template:Tim.)
    • order: if the string starts with a number (where spaces and nbsp's at the start are ignored) the order is numeric according to the first number in the string (parseFloat is applied) after removing the commas, if any; if it does not (parseFloat returns NaN), the element is positioned like 0
proposed internationalisation: in German etc., treat comma as a decimal point
  • string
    • criterion: all other cases; to avoid one of the other modes, start e.g. with a hidden "&"; this can be done conveniently with Template:Tim, which also allows more hidden text, as sortkey; while the similar templates above are called at the end of a table element, call this one at the start
    • order: after conversion of capitals to lowercase the order is ASCII; a partial list showing the order: !"#$%&'()*+,-./09:;<=>?@[\]^_`az{|}~é— (see also below; a blank space comes before every other character; an nbsp code counts as a space; two adjacent ordinary blank spaces count as one; for multiple blank spaces one can use nbsps or alternate nbsps and ordinary blank spaces)
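The criteria above are tested in the listed order, which can be sketched as follows. This is a simplified Python illustration, not the wiki's actual JavaScript: the function name detect_sort_mode and the regular expressions are assumptions transcribed from the prose, and the "sm=d" check ignores the requirement that the text start with "dd aaa dddd".

```python
import re

# Patterns transcribed from the criteria in the list above (an approximation).
DATE_PATTERNS = [
    re.compile(r"^\d\d-\d\d-\d{4}$"),         # dd-dd-dddd
    re.compile(r"^\d\d-\d\d-\d\d$"),          # dd-dd-dd
    re.compile(r"^\d\d [A-Za-z]{3} \d{4}$"),  # dd aaa dddd
]
NUMERIC = re.compile(
    r"^[\d.,\s+\-]+"       # digits, points, commas, spaces, "+", "-"
    r"(?:[eE][+\-\d]+)?"   # optionally "e" or "E" and an exponent
    r"(?:×10[+\-\d]+)?$"   # optionally "×10" and an exponent
)


def detect_sort_mode(cell: str) -> str:
    """Return the sort mode for the text of the top cell (hypothetical helper)."""
    text = cell.strip()
    # The first matching criterion wins, so the order of these tests matters.
    if text.endswith("sm=d") or any(p.match(text) for p in DATE_PATTERNS):
        return "date"
    if text.startswith(("$", "£", "€", "¥")) or text.endswith("sm=c"):
        return "currency"
    if text.endswith("sm=n") or NUMERIC.match(text):
        return "numeric"
    return "string"


for sample in ["24-12-2007", "$ 9", "123,456,789", "3,000,000 abc", "12 or 13 sm=n"]:
    print(sample, "->", detect_sort_mode(sample))
```

Because the date test comes first, 24-12-2007 is classified as a date even though it also consists only of digits and "-" signs.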

The sort mode is determined by the table element that is currently in the first non-blank row below the header. (To minimize the deviation from wikibits.js, the code "ns=.." works in the first element below the header instead of in the header.) If a column contains different data types (according to the criteria mentioned above), the sort mode may change after sorting, which can give a cycle of four or even more orderings instead of two. This is confusing and gives undesired sorting orders, so it is best avoided. However, it can be complicated to assess whether an element can ever be at the top after any sorting operations on the same and other columns, and this can also change after deleting a row or adding a column. Therefore it is wise to make sure that every element matches the criterion for the required data type. Using a row template (https://mywikibiz.com/Help:Table#Row_template) this can be done very conveniently.

Examples

Text after a number (e.g. a footnote) does not affect the sorting order, if the sorting mode is numeric. However, if the number at the top has text after it, this makes the sorting mode alphabetic, unless it ends with a (typically invisible) "sm=n".
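The numeric ordering rule (remove commas, take the first number, treat anything else as 0) can be sketched as follows. This is a Python approximation of JavaScript's parseFloat, written for illustration only; numeric_sort_key and the regular expression are assumptions, not the wiki's code.

```python
import re

# Longest leading decimal literal, roughly what JavaScript's parseFloat accepts
# (leading whitespace, including nbsp, is skipped).
_FLOAT_PREFIX = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)")


def numeric_sort_key(cell: str) -> float:
    """Sort key for numeric mode: first number after removing commas, else 0."""
    cleaned = cell.replace(",", "")
    match = _FLOAT_PREFIX.match(cleaned)
    return float(match.group(1)) if match else 0.0


cells = ["9", "-80", "80 abc 5", "abc 80", "70", "600"]
print(sorted(cells, key=numeric_sort_key))
# ['-80', 'abc 80', '9', '70', '80 abc 5', '600']
```

The entry "abc 80" lands at 0 because it does not start with a number, while "80 abc 5" sorts at 80 because only the first number counts.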

numbers
123,456,789
2,500,000,000
300,000,000
3,000,000 abc
5,000,000
2,000 def
-4,000
ghi
-9,999
4,000
9,999
800,000
900,000
numbers
123 564,589.7e12
9
-80
80 abc 5
abc 80
70
600
first alphabetic, later also numeric mode
123.4 ghi
2,500,000,000
300,000,000
3,000,000 abc
5,000,000
2,000 def
4,000
9,999
800,000
900,000
currencies
$ 9
$ 80
$ 70
$ 600
currencies
€ 9
€ 80
€ 70
€ 600
currencies
£ 9
£ 80
£ 70
£ 600
currencies
¥ 9
¥ 80
¥ 70
¥ 600
comparison
a 9
a 80
a 70
a 600
comparison
e 9
e 80
e 70
e 600
"currency" mode
9sm=c
circa 80sm=c
7089sm=c
70 90sm=c
70 91sm=c
70 89sm=c
70 to 90sm=c
7091sm=c
600sm=c
"currency" mode ("ca. 80" sorts at 0 due to dot)
9sm=c
ca. 80sm=c
7089sm=c
70 to 90sm=c
7091sm=c
600sm=c

The example with "a" gives alphabetic sorting, and so does the example with "e": the data are not mistaken for numbers in scientific format.
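The behaviour of the "currency" examples above can be reproduced with a small sketch. It assumes, following the description of this mode, that the currency symbols, ordinary letters, and commas are removed (but not spaces) before a parseFloat-style parse; currency_sort_key is a hypothetical name and the code is a Python approximation, not the wiki's JavaScript.

```python
import re

_STRIP = re.compile(r"[$£€¥A-Za-z,]")  # symbols, ordinary letters, commas
_FLOAT_PREFIX = re.compile(r"\s*([+-]?(?:\d+\.?\d*|\.\d+))")


def currency_sort_key(cell: str) -> float:
    """Sort key for "currency" mode: strip symbols and letters, then parse."""
    cleaned = _STRIP.sub("", cell)
    match = _FLOAT_PREFIX.match(cleaned)
    return float(match.group(1)) if match else 0.0


for cell in ["$ 9", "circa 80sm=c", "ca. 80sm=c", "70 to 90sm=c", "1e3sm=c"]:
    print(repr(cell), "->", currency_sort_key(cell))
```

"ca. 80sm=c" becomes ". 80=" after stripping, which does not start with a number and therefore sorts at 0. "1e3sm=c" becomes "13=", which shows why scientific notation cannot be used in this mode.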

mixed notations
12 or 13 sm=n
-12 (retrograde)
12 (?)
ca. 12
?
1.4285714285714E+17
1000000000000000000
-1000000000000000000
.0000000000000000001
-.0000000000000000001
-1.4285714285714E+17
1.4285714285714E-13
-1.4285714285714E-13
89 123 456 788
89,123,456,789
333
1e10
e 9
e 80
e 70
e 600
999e9
88e80
7e270
999e-9
88e-80
7e-270
4e12×1012
-999e9
−999e9
-88e80
-7e270
-999e-9
-88e-80
-7e-270
e3
-e3
1e3
e9
e80
e270
2e12×1012
7e11×1011
6e11
8e11
7e-11×10-11
-7e11×1011
-7e-11×10-11
3e2×102
4e2×102
first number in each element counts
7-4
2
4
22/7
111
percentage
7%
2
4
22
111
mixed notations
14
-14
11
-12 (retrograde)
12 or 13
12 (?)
ca. 12
12 (approx.)
?

The first example demonstrates that text is positioned at zero, and that e.g. e3 for 1000 is not allowed; use 1e3 instead. Using Template:Tim a number in scientific notation can be displayed with a superscript, while still allowing numeric sorting (compare the method described in the next section). It also shows that "-" should be used, not "−". The first element "12 or 13 sm=n" has a visible "sm=n", although it is normally made invisible; when this element is at the top, numeric sorting mode applies.

The second example shows that expressions are not sorted according to their evaluated value, but according to the first number.

The third example shows that a percentage is accepted for numeric sorting mode, but ignored in the actual sorting, so if a column contains percentages, all numbers have to be written as a percentage.

The fourth example shows again that "ca. 12" sorts at 0, as opposed to 12 with some text after it, which sorts at 12. In the latter case an invisible "sm=n" needs to be put at the end, in case such an element arrives at the top of a column, which would otherwise cause alphabetic sorting mode.

Sortable version of Table:Climate in Middle East cities (https://mywikibiz.com/Table:Climate_in_Middle_East_cities), using "sm=n" to specify numeric sort mode:

City January (Low) January (High) July (Low) July (High)
Amman 4°Csm=n 12°Csm=n 18°Csm=n 32°Csm=n
Baghdad 0°Csm=n 16°Csm=n 24°Csm=n 43°Csm=n
Cairo 8°Csm=n 18°Csm=n 21°Csm=n 36°Csm=n
Damascus 0°Csm=n 12°Csm=n 16°Csm=n 36°Csm=n
Dubai 15°Csm=n 23°Csm=n 30°Csm=n 39°Csm=n
Jerusalem 5°Csm=n 13°Csm=n 17°Csm=n 31°Csm=n
Riyadh 8°Csm=n 21°Csm=n 26°Csm=n 42°Csm=n
Tehran -3°Csm=n 7°Csm=n 22°Csm=n 37°Csm=n

Additional features

Excluding the last row from sorting

Sometimes it is helpful to exclude the last row of a table from the sorting process.

This can be achieved using class="sortbottom" on the desired table row (line starting with |-).

Wiki markup

{|class="wikitable sortable"
!Name!!Surname!!Height
|-
|John||Smith||1.85
|-
|Ron||Ray||1.89
|-
|Mario||Bianchi||1.72
|-class="sortbottom"
|||Average:||1.82
|}

What it looks like in your browser

Name Surname Height
John Smith 1.85
Ron Ray 1.89
Mario Bianchi 1.72
Average: 1.82

Issues

Please note that only one row can be marked with class="sortbottom". Doing otherwise results in the marked rows being sorted with each other, which usually is not the intended behavior.

Wrong - using more than one "sortbottom"
Name Surname Height
John Smith 1.85
Ron Ray 1.89
Average: 1.82
Name Surname Height

Making a column unsortable

If you want a specific column not to be sortable, specify class="unsortable" in the attributes of its header cell.

Wiki markup

{|class="wikitable sortable"
!Numbers!!Alphabet!!Dates!!Currency!!class="unsortable"|Unsortable
|-
|1||Z||02-02-2004||5.00||This
|-
|2||y||13-apr-2005||||Column
|-
|3||X||17.aug.2006||6.50||Is
|-
|4||w||01.Jan.2005||4.20||Unsortable
|-
|5||V||05/12/2006||7.15||See?
|-class="sortbottom"
!Total: 15!!!!!!Total: 29.55!!
|-
|}

What it looks like in your browser

Numbers Alphabet Dates Currency Unsortable
1 Z 02-02-2004 5.00 This
2 y 13-apr-2005 Column
3 X 17.aug.2006 6.50 Is
4 w 01.Jan.2005 4.20 Unsortable
5 V 05/12/2006 7.15 See?
Total: 15 Total: 29.55 Original example

Sorting with hidden sortkey

If necessary one can apply sorting using a sortkey which due to CSS is not displayed:

<span style="display:none">...</span>

(However, on some projects, notably Ontoworld, a page with this wikitext cannot be saved, because of spam protection.)

JavaScript sorting is based on the text inside and outside the tags, without the tags themselves. A hidden sortkey can be put at the start. In both alphabetic and numeric sorting, the first parts determine the order. For the purpose of a hidden sortkey for numeric sorting, the criterion for the item at the top being a number has been adapted: ignoring span and sup tags, it can be a number followed by "×10" and an exponent (use the same cross sign). This format is e.g. produced by Template:Tim. A hidden sortkey for alphabetic sorting does not have such restrictions.

Alphabetic sorting with hidden sortkey

The sortkey comes at the start and is separated from the displayed text in such a way that the latter does not affect the sorting order. For example, if a sortkey system is used where there are no blank spaces in any sortkey, then a blank space can be used for separation. If a single blank space is possible in a sortkey, two nbsps can be used. For table elements for which the text to be displayed is equal to the sortkey, no duplication is needed, of course.

If the text inside and outside the tags together is of a form that would cause a sorting mode other than alphabetic (if and when the element is at the top), a character can be appended at the end of the sortkey to avoid this, again making sure it does not affect the sorting order by putting a space or two nbsps. This can be dispensed with if the element can never be at the top, but this can be complicated to assess as that can be caused by sorting other columns, with varying sorting modes, and it can change when deleting a row, adding a column, etc.

Instead of "display:none", another way is to use a font color equal to the background, e.g. <font color="#f9f9f9">999</font> gives "999". With this method the hidden code can be seen in selected text (e.g. with the mouse). Also, the hidden text is included when copying the rendered text. The first may be an advantage or a disadvantage; the second seems only a disadvantage. A further complication is that if a user uses a background color different from the default, the specified text color may not match it; to make sure they are the same, the background color can be specified as well.

Unsuitability of padding with no-break spaces

The effect of left-padding with "&nbsp;" codes, which render as blank spaces, depends on the browser: in IE they are (unlike actual blank spaces) counted for sorting as leading blank spaces, so in a list of numbers with text (for which the alphabetic sorting mode applies) they could be used to equalize the number of characters before the explicit or implicit decimal separator. However, in Firefox they are ignored for the purpose of sorting.

Sorting using nbsps, works on IE but not on Firefox Name
100.3 FM Third
 89.5 FM First
107.3 FM Fourth
 95.3 FM Second

Padding with zeros

Example:

  • 000156

Formatnum can be combined with padleft:

Integer:

{{formatnum:{{padleft:299792458|16|0}}}} gives:

  • 0,000,000,299,792,458

Real:

{{formatnum:{{padleft:{{#expr:((299792458.056 - .5) round 0)}}|16|0}}}}.{{padleft:{{#expr:(1000000*(299792458.056 - ((299792458.056 - .5) round 0))) round 0}}|6|0}} gives:

  • 0,000,000,299,792,458.056000
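The reason zero-padding works is that strings of equal length, with separators in the same positions, sort alphabetically in numeric order. A minimal Python sketch of the integer case follows; pad_and_group is a hypothetical helper that imitates {{padleft:...}} followed by {{formatnum:...}} for non-negative integers, assuming comma grouping.

```python
def pad_and_group(n: int, width: int = 16) -> str:
    """Left-pad a non-negative integer with zeros, then group digits by three."""
    digits = str(n).rjust(width, "0")  # like {{padleft:n|width|0}}
    groups = []
    while digits:
        groups.append(digits[-3:])     # take groups of three from the right
        digits = digits[:-3]
    return ",".join(reversed(groups))  # like {{formatnum:...}} with commas


values = [299792458, 156, 89, 1000]
keys = [pad_and_group(v) for v in values]
print(pad_and_group(299792458))  # 0,000,000,299,792,458
# Alphabetic order of the padded strings equals numeric order of the values.
print(sorted(keys) == [pad_and_group(v) for v in sorted(values)])  # True
```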

Alphabetic sortkey for numeric sorting

Numeric sort mode can be forced with Template:Tim (see above). Numbers preceded by text are sorted like 0; to avoid that, the text can be preceded by a hidden copy of the number.

If for some reason one wants to use alphabetic sort mode for numbers, one can construct a hidden alphabetic sortkey for this purpose. This can be done for all numbers between -1e100 and 1e100 in arbitrary precision as follows:

  • where scientific notation is used, it is normalized such that the absolute value of the mantissa is between 1 and 10; the exponent is put first
  • scientific notation is used for all negative numbers, and all positive numbers outside some interval (below: 1e-9 to 1e9), and not inside that interval
  • where the absolute value of the exponent and/or the mantissa is a decreasing function of the number, the notation uses its complement with respect to 99 for exponents and 10 for mantissas; the code "c" is added in these cases
  • numbers 0 ≤ x < 1000 get a "+" in front
  • positive numbers in scientific notation with a negative exponent get "+0" in front
  • spaces inside the code and &-signs in front are added where needed:
    • for numbers not in scientific notation the positions of all explicit and implicit decimal points are aligned
    • for the starting position, i.e. the position of the first "-", "+", or "e", of other numbers, see the example table
    • no code should satisfy the criterion for numeric sorting mode (below we have always either an ampersand or two letters e): although this matters only for the element at the top, any element might arrive at the top due to sorting another column

In the following table the left column shows the full code for alphabetic sorting: the cryptic sortkey, followed by the regular notation. The second column contains the same text (and hence sorts the same), but with the code hidden with CSS. The third column shows the corresponding plain numbers with thousands separators, equal to what the second column shows, now using numeric sorting mode.

full code for alphabetic sorting display form plain number
&&&&&&&&&+6 &&&&&&&&&+6 6
&&&&&&&&&+7 &&&&&&&&&+7 7
&&&&&&1,234 &&&&&&1,234 1,234
&&&&&&&+123 &&&&&&&+123 123
e23 6 6e23 e23 6 6e23 6e23
e09 1 1e9 e09 1 1e9 1e9
&&&&&&&&&+0 ec89 9.999,99 9.999,99e-10 &&&&&&&&&+0 ec89 9.999,99 9.999,99e-10 9.999,99e-10
&&&&&&&&&+0.000,000,001 &&&&&&&&&+0.000,000,001 0.000,000,001
&&&&&&&&&+0 ec87 6 6e-12 &&&&&&&&&+0 ec87 6 6e-12 6e-12
&&&&&&&&&+0 ec86 7 7e-13 &&&&&&&&&+0 ec86 7 7e-13 7e-13
&&&&&&&&&+0 ec87 5 5e-12 &&&&&&&&&+0 ec87 5 5e-12 5e-12
&&&&&&&&&&-e-10 c0.000,01 -9.999,99e-10 &&&&&&&&&&-e-10 c0.000,01 -9.999,99e-10 -9.999,99e-10
&&&&&&&&&&-e-08 c6.8 -3.2e-8 &&&&&&&&&&-e-08 c6.8 -3.2e-8 -3.2e-8
&&&&&&&&&&&-ec86 c0.3 -9.7e13 &&&&&&&&&&&-ec86 c0.3 -9.7e13 -9.7e13
&&&&&&&&&&&-ec99 c7.7 -2.3 &&&&&&&&&&&-ec99 c7.7 -2.3 -2.3
&&&&&&&&&+0 &&&&&&&&&+0 0
&&&&&&&&&+0.3 &&&&&&&&&+0.3 0.3

Dates

Date sort mode
07 Apr 2007
16 Apr 2007
16 Mar 2007
05-04-2007
04-05-2007
18 Mar 2007
27 Mar 2007
20 Aug 2006
22 Jul 2006
Date sort mode
07 Apr 2007 - 2 May 2007sm=d
16 Apr 2007
16 Mar 2007
05-04-2007
04-05-2007
18 Mar 2007
27 Mar 2007
20 Aug 2006
22 Jul 2006
Date sort mode, sorting works for no preference and preference dmy
07 Apr 2007
00 Jan 2007
00 Mar 2007
16 Apr 2007
28 Feb 2007
28 Feb 2007
28 Jan 2007
28 Jan 2007
07 Apr 2007
16 Apr 2007
1 Mar 2007
01 Mar 2007
27 Mar 2007
20 Aug 2006
22 Jul 2006
1 Mar 2007
01 Mar 2007
27 Mar 2007
20 Aug 2006
22 Jul 2006
String sort mode (edit to view source)
date
2006 a
2006-12-032006-12-03
-0000-03-27-0000-03-27
2006-12 December 2006
!9936-04 April 64 BC
!9900-07-13-0099-07-13
!9937-09-23-0062-09-23
!9937-10-08-0062-10-08
!9998-12-21-0001-12-21
2006-11-082006-11-08
0304-12-310304-12-31
2005-05-152005-05-15
Date sort mode
date
00-00-2006
03 Dec 2006
27 Mar 0000
22 Mar 000027 Mar 1 BCsm=d
00 Dec 2006
00 Apr !936 Apr 64 BCsm=d
13 Jul !90013 Jul 100 BCsm=d
23 Sep !93723 Sep 63 BCsm=d
08 Oct !93708 Oct 63 BCsm=d
21 Dec !99821 Dec 2 BCsm=d
08 Nov 2006
31 Dec 0304
15 May 2005

The sort mode is based on the rendered format; in the case of links: the labels, not the targets (though including any content hidden by "display:none").

Date sort mode:

One of the formats allowed for the date sort mode is produced by MediaWiki's date-formatting feature with the right combination of preference and wikitext format: we need to use in the wikitext the format [[dd mmm]] [[yyyy]] (done in the example) and either no preference or preference dmy, or use with preference dmy one of the formats [[mmm dd]][[yyyy]], [[yyyy]][[mmm dd]], or [[yyyy]][[dd mmm]].

Incomplete dates:

  • <span style="display:none">00 Jan </span>2007
  • <span style="display:none">00</span> Mar 2007
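The ordering rule of the date mode, and the effect of the hidden "00" day used for incomplete dates, can be sketched as follows. This Python illustration follows the rearrangement described under "Sort modes" (DD-MM-YYYY, DD-MM-YY, or "dd aaa dddd" turned into a YYYYMMDD string); date_sort_key and MONTHS are hypothetical names, and the input is the rendered text including any hidden part.

```python
MONTHS = {
    "jan": "01", "feb": "02", "mar": "03", "apr": "04",
    "may": "05", "jun": "06", "jul": "07", "aug": "08",
    "sep": "09", "oct": "10", "nov": "11", "dec": "12",
}


def date_sort_key(date: str) -> str:
    """Rearrange a date string into YYYYMMDD so that string order is chronological."""
    if len(date) == 11:  # "dd aaa dddd"
        return date[7:11] + MONTHS[date[3:6].lower()] + date[0:2]
    if len(date) == 10:  # DD-MM-YYYY
        return date[6:10] + date[3:5] + date[0:2]
    if len(date) == 8:   # DD-MM-YY; years 50..99 are read as 19YY
        yy = date[6:8]
        century = "19" if yy >= "50" else "20"
        return century + yy + date[3:5] + date[0:2]
    raise ValueError(f"unsupported date format: {date!r}")


dates = ["07 Apr 2007", "00 Jan 2007", "05-04-2007", "20 Aug 2006", "00 Mar 2007"]
print(sorted(dates, key=date_sort_key))
# ['20 Aug 2006', '00 Jan 2007', '00 Mar 2007', '05-04-2007', '07 Apr 2007']
```

With day "00", an incomplete date such as "Mar 2007" sorts just before 01 Mar 2007.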

String sort mode:

String sort mode provides chronological sorting for dates formatted as <span style="display:none">&</span>YYYY-MM-DD; the hidden "&" avoids numeric sort mode.

We can also hide the YYYY-MM-DD and follow it with any choice of displayable text, including MediaWiki date formatting. The Wikipedia template Template:Tiw provides a convenient way of applying this method while using the date-formatting feature for display.

For years BC we can use, for example, !9937-09-23 for -0062-09-23 (subtract the year number BC from 10000, or the absolute value of the astronomical year from 9999).
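The complement rule for years BC can be written out as a short Python sketch. The helper names are hypothetical, the year is given as an astronomical year (so 63 BC is -62), and treating year 0 as "0000" is an assumption made for this illustration.

```python
def year_sort_prefix(astronomical_year: int) -> str:
    """Year part of the sortkey; negative years use "!" and the 9999-complement."""
    if astronomical_year >= 0:
        return f"{astronomical_year:04d}"
    return "!" + f"{9999 - abs(astronomical_year):04d}"


def iso_sort_key(year: int, month: int, day: int) -> str:
    """Build a YYYY-MM-DD style sortkey that sorts chronologically as a string."""
    return f"{year_sort_prefix(year)}-{month:02d}-{day:02d}"


dates = [(2006, 12, 3), (-62, 9, 23), (-99, 7, 13), (304, 12, 31), (-1, 12, 21)]
keys = sorted(iso_sort_key(*d) for d in dates)
print(keys)
# ['!9900-07-13', '!9937-09-23', '!9998-12-21', '0304-12-31', '2006-12-03']
```

Since "!" sorts before every digit, all BC dates come before AD dates, and the complement makes earlier BC years sort first.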

If a table column contains any or all incomplete dates, this will not cause sorting problems. If only a year and month are given, that incomplete date is positioned alphabetically before the first day of the month in question. Likewise, if only a year is given, the date is positioned before the first month or day given for that year.

If at some point (i.e., after possible previous sorting) the form [[YYYY]] is at the top with a non-negative year, sorting would be numerical; in this case, after toggling between ascending and descending there would be no proper sorting within each year (because parseFloat is applied, finding the first number in the string, and basing sorting on only that number). Also, years BC would not be sorted properly. Therefore, alphabetic sorting has to be enforced. This can be done by putting a non-displayed character after the year, separated by a space.

Secondary sortkey

If a column contains a value multiple times then sorting the column preserves the order of the rows within each subset that has the same value in that column (stable sorting, https://mywikibiz.com/Sorting_algorithm#Stability). Thus sorting based on a primary, secondary, tertiary, etc. sortkey can be done by sorting the least-significant sortkey first, etc.

First click on column Alphabet and then on Numbers, you'll see that the ordering is on Numbers (1), Alphabet (2).

Numbers Alphabet Dates Currency Text
4 a 01.Jan.2005 4.20 row 1
5 a 05/12/2006 7.15 row 2
1 b 02-02-2004 5.00 row 3
1 a 02-02-2004 5.00 row 4
2 x 13-apr-2005 row 5
2 a 13-apr-2005 row 6
3 a 17.aug.2006 6.50 row 7
3 z 17.aug.2006 6.50 row 8
Bottom
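The two clicks described above correspond to two successive stable sorts, least-significant key first. A minimal Python sketch using the Numbers, Alphabet, and Text columns of the table above (Python's sorted is guaranteed to be stable; the variable names are illustrative):

```python
rows = [
    (4, "a", "row 1"),
    (5, "a", "row 2"),
    (1, "b", "row 3"),
    (1, "a", "row 4"),
    (2, "x", "row 5"),
    (2, "a", "row 6"),
    (3, "a", "row 7"),
    (3, "z", "row 8"),
]

# First "click": sort on the secondary key (Alphabet).
by_alphabet = sorted(rows, key=lambda r: r[1])
# Second "click": sort on the primary key (Numbers); ties keep the Alphabet order.
by_numbers = sorted(by_alphabet, key=lambda r: r[0])

print([label for _, _, label in by_numbers])
# ['row 4', 'row 3', 'row 6', 'row 5', 'row 7', 'row 8', 'row 1', 'row 2']
```

The result is the same as a single sort on the pair (Numbers, Alphabet).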

Limitations

JavaScript sorting may not work properly on tables with cells extending over multiple rows and/or columns. Also, while cells can be empty, they should not be missing at the end of a row. In these cases the table sometimes gets messed up when attempting to sort, while at other times some of the sorting buttons work and others don't.

Controlling sorting and display

Text undesired for sorting but needed for display:

  • In numeric sorting mode, this text (e.g. footnotes) needs to be put after the number, and an invisible "sm=n" after that. See e.g. Help:Sorting/countries.
  • In date sorting mode, this text needs to be put in a separate column; in the case of a cell containing a range of dates or numbers (e.g. from .. to ..), text in surplus of what is required for sorting is put in the extra column. If the first part of the text is used for sorting, then the extra column needs to be the following one; conversely, if the last part of the text is used for sorting, then the extra column needs to be the previous one; depending on the table format, this dividing of an item over two cells may look ugly.
  • In alphabetic sorting, any footnotes etc. do not require a separate column; they can simply be put at the end of the element.

Text undesired for display but needed for sorting:

  • can be put as hidden text in the column to be sorted

Combining the two, we can have displayed text independent of text used for sorting, by fully hiding the latter, and fully putting the former in a separate column (in date sorting mode and numeric sorting mode) or in the same column after the hidden text (in alphabetic sorting). Fully putting the displayed text in a separate column may look ugly if it is not done consistently for a whole column, but only for elements that require this (e.g. if most entries in a column are single numbers, but some are ranges).

Sorting the wikitext of a table

Unfortunately it does not seem possible to directly and automatically sort the wikitext itself, according to one of the sortkeys. This would, after saving, directly produce a table sorted as required.

However, if for a given table, we make an auxiliary sortable table rendering as wikitext for the original table, we can sort the wikitext of the original table.

Example:

Original table:

demo
9
12
11

Auxiliary table:

{|class="sortable" style="width:100%"
!demo

header
|-
| 9
|-
|12
|-
|11

|}

After copying the rendered text to the edit box, and deleting the header line, this renders as:

demo
9
11
12

Alphabetic sorting order

demo
!
"
#
$
%
&
'
(
)
*
+
,
-
.
/
0
9
:
;
<
=
>
?
@
[
\
]
^
_
`
A
Z
a
z
A1
Z1
a1
z1
{
|
}
~
É
é
É1
é1

The two-character entries such as A1 demonstrate that A and a are at the same position.
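The string-mode ordering can be approximated with a short Python sketch. It assumes, following the description of the string mode, that capitals are converted to lowercase, that adjacent ordinary spaces count as one, and that an nbsp counts as a space; string_sort_key is a hypothetical name and the comparison is by character code.

```python
import re


def string_sort_key(cell: str) -> str:
    """Sort key for string mode: collapse spaces, map nbsp to space, lowercase."""
    collapsed = re.sub(r" {2,}", " ", cell)  # adjacent ordinary spaces count as one
    return collapsed.replace("\u00a0", " ").lower()


cells = ["b", "Z1", "a1", "A", "é", "~"]
print(sorted(cells, key=string_sort_key))
# ['A', 'a1', 'b', 'Z1', '~', 'é']
```

Because the key is lowercased, "A1" and "a1" get the same key and keep their relative order under a stable sort.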