Wikipedia talk:Categorization

From Wikipedia, the free encyclopedia

This term "diffusing"

...feels very abstract and strange to me. I do understand what you guys mean by it, but "subcategorization" feels *much* more natural as a term for where entries can be, well, subcategorized. And instead of reusing this term to *also* mean mutually exclusive subcategories, i.e. if you put an entry into one subcategory then you shouldn't also put it into another, why not simply say that? I will also note that we still need a term for the specific meaning "the main category can (or cannot) contain entries", since "diffusing" doesn't appear to be used to differentiate between "if you put an entry in subcategory A1 you can or cannot put it also in subcategory A2" and "if you put an entry in subcategory A1 you can or cannot put it also in the main category A".

I guess it feels both sloppy and overly technical at the same time: we use "diffuse" both as a noun and a verb, but it still has multiple meanings! Either we don't invent special terms at all, or we invent terms that are precise and specific.

Several templates (linked to here) appear to already ditch the term for the more direct and natural alternatives. How about rewriting this article to get rid of the "diffusing" term altogether?

A sentence like "Note that some categories can be non-diffusing on some parents, and diffusing on others." stands out like a sore thumb to me. I'm a programmer, I recognize when someone is writing technical documentation without a sufficiently developed sense for what the layman can digest... The way it starts with "Note that" suggests this is helpful clarifying context, but then it couldn't use more obtuse and abstract language if it tried. What it is trying to say could be sooo much more simply said, without using this needlessly complicated language:

Note that a certain group of articles can be subcategorized both into exclusive and non-exclusive subcategories at the same time. For example, when we break down Category:Women novelists by nationality we put each entry into only one subcategory and we leave none in the main category; while when we break down Category:British novelists into Category:British women novelists we don't do either of those things, because of the issue of othering discussed above.

Also, is this example truly a good one to use? Looking at Category:Women novelists by nationality, why aren't container categories discussed in this context? I mean, doesn't the message "This is a container category. Due to its scope, it should contain only subcategories." kind of ignore the message we're trying to send here - that scope should not always be used to argue that the main category should be left empty?

CapnZapp (talk) 11:44, 30 May 2025 (UTC)

I wonder if this confusion is common enough to undertake the work of re-wording the guideline, all the categories, and all other pages that refer to this guideline or the concept itself. I haven't encountered many editors who are confused by our use of the term "diffusing", though the concept and its practical application certainly generate confusion (many such discussions in the archive). Firefangledfeathers (talk / contribs) 15:36, 30 May 2025 (UTC)
  • I'd rather propose to abolish non-diffusing categories altogether. They are entirely counterintuitive to the general rule that articles should be in the most specific categories possible. Non-diffusing categories primarily exist to solve the WP:FINALRUNG problem but a more intuitive solution for that problem is upmerging any "ghetto" categories. Marcocapelle (talk) 15:51, 30 May 2025 (UTC)
    The "ghetto" categories should exist, so that people can find people in underrepresented groups when that intersection is notable. And in many cases they are large. Upmerging is not an option. But ghettoizing people only in those categories is also a problem. Hence, non-diffusing categories. —David Eppstein (talk) 18:19, 30 May 2025 (UTC)
    For clarity, are you saying that non-diffusing categories only apply to articles about people? - jc37 07:14, 1 June 2025 (UTC)
    No, I am merely stating that upmerging is not a solution because it would go against the "ghettoization" rationale for non-diffusing categories about people.
    If you had read WP:NONDIFFUSING, you would see that another rationale is for "subsets which have some special characteristic of interest" where that special characteristic would not justify removing them from the main category. Current examples not about people include Category:1912 lost films and Category:Sega CD-only games; I have no strong opinion about whether they should remain non-diffusing. —David Eppstein (talk) 07:36, 1 June 2025 (UTC)
    Smiles at "If you had read..."  : )
    Anyway, I was asking you what you think the usage should be. Thank you for clarifying. - jc37 08:59, 1 June 2025 (UTC)
  • I think we need to avoid using overly charged words in discussing non-diffusing rules. There are a lot of conflicting meanings to these words. We should avoid inflammatory rhetoric. We also need to assume good faith far more than we do. More to the point, we need to either alter the rules to agree with actual use or alter use to agree with the rules, or some of both. John Pack Lambert (talk) 13:39, 19 June 2025 (UTC)
  • Our rules say ERGS categories are always non-diffusing. Then we say that sports categories can be an exemption. The reality is that we have diffused people in acting and some people in singing, and I believe de facto people in modeling and competitors in beauty pageants almost completely. There may be a few cases that occur in the other parts of ERGS but they are rare. I think we should state something in the rule: "In singing, voice range and name are largely defined by gender, and the roles, actions and careers of singers are so heavily defined by gender, that someone can be in say Category:French women singers or Category:French male singers without needing to be in a gender neutral category as well. This rule does not apply to musicians who are not singers. The last rung rule still applies on ERS grounds. So someone should only be in African-American women singers if she is also in a sub-cat of American women singers that is open to all ethnicities. For example we used to have 20th-century African-American women opera singers, but 20th-century American women singers only had that as a diffusing sub-cat, so we upmerged 20th-century African-American women opera singers to 20th-century American women opera singers, African-American women opera singers and 20th-century African-American women singers. Similar rules to singing also apply to dance, modeling, acting and beauty pageants, where people can be in only gendered sub-cats. They do not apply to film directing, writing or other categories where, while gender is defining for both males and females, the roles and actions of male and female are not so defining that they create clearly different roles. Sometimes code words are used to refer to a group of people of a certain ethnicity in a particular field, especially musical genres. One example is "blue eyed soul", which is not a genre but a description of people perceived to be ethnically of European origin who create soul music.
This category was deleted because it is not an actual genre." John Pack Lambert (talk) 13:56, 19 June 2025 (UTC)
  • I think at times people have argued "we should keep this category that intersects ethnicity and occupation because these 2 sources discuss it as a group" while ignoring the last rung category. If I could find 2 articles that are reliable sources on the broad topic of African-American pediatric surgeons I would still argue we should not create Category:African-American pediatric surgeons, because there are no sub-categories of American pediatric surgeons at all. I wonder if we need to define non-diffusing better though. Category:American pediatricians has 5 specialty sub-cats, but combined those categories only take in about 20% of the category. That means that roughly 80% of the people in Category:American women pediatricians are also in Category:American pediatricians directly. In theory if we diffused American pediatricians by state or by century this overlap would be a lot lower. I am not sure either such diffusion is wise; doctors move around too much to make diffusion by state workable, unless we clearly say to categorize by where most of the person's post-residency career was, not by where they were born, raised, received medical education or did residency. That would still only average about 20 per state, and might not even get all states, so I have doubts. Centuries I think would have high overlap and at most 3 categories. I think considering the size of American women physicians (well over 1000 I think, it has 266 base articles, about 650 in specialties, but then there are by century sub-cats, which in theory have almost 1000 but may have lots of overlap, and I have no idea what percent of articles in cat by both specialty and century). We also have a physicians by US state tree; I have no idea how we decide which state or states to place people in, but at present there are no medical specialties split by state, nor are the state cats divided by century. American women cardiologists is probably a closer last rung issue.
The main category has 227 articles with only 16 in diffusing sub-specialties. The other sub-cats are for society leadership or being editors in the field which I am not sure we actually diffuse, so this is only about 20% diffusing. Of the 26 people in American women cardiologists, 25 are also directly in American cardiologists and 1 is in American pediatric cardiologists (she was not in American women pediatricians so I added her there). This is as applied essentially a last rung category. I think we should upmerge American women cardiologists to women cardiologists and American women physicians, since all articles are already in American cardiologists or a gender neutral sub-cat. This really makes sense since there are only 33 direct articles in women cardiologists and only an American sub-cat. With only 59 articles on women cardiologists it makes no sense to sub-divide it by nationality, and it is not normally reasonable to have just one nationality sub-cat in a field (there are rare exceptions, but this is not one). We might want to come up with a guideline as to the percentage of diffusion that makes something no longer a last rung, or the percentage of overlap between an ERGS sub-cat and the non-diffusing parent that in theory has non-ERGS sub-cats at which it really is no longer the last rung. Or maybe we do not want firm percentages and should just have language to tell people that just because something has a non-ERGS sub-cat does not mean it is not the last rung if almost all the contents are in that category. John Pack Lambert (talk) 14:41, 19 June 2025 (UTC)
    • The TL;DR summary of the above is that I think Category:American women cardiologists should be upmerged to Category:American women physicians and Category:Women cardiologists (but not Category:American cardiologists because all articles are already in either Category:American cardiologists or in 1 case the diffusing sub-cat Category:American pediatric cardiologists). Category:Women cardiologists only has 59 articles in the whole tree with only 1 sub-cat so it is too small to diffuse by nationality, and Category:American cardiologists has so few diffusing sub-categories that take in such a small percentage of the articles in the category that diffusing it by gender violates the spirit of the no last-rung ERGS rule. John Pack Lambert (talk) 14:41, 19 June 2025 (UTC)

"O'" rule for first names

NAMESORT is unspecific in regards to this so I figured I should ask: does the exception for excluding apostrophes in names that start with "O'" only apply to family names, or should someone with such a name as their given name (e.g. O'Donel Levy) follow the rule as well? QuietHere (talk | contributions) 02:56, 21 June 2025 (UTC)

Can a bot help rename?

Can a bot help rename Category:The 50,000 Challenge to Category:The 20,000 Challenge? ☆ Bri (talk) 15:40, 16 August 2025 (UTC)

@Bri: Please follow the directions at WP:CFDS. --Redrose64 🌹 (talk) 10:21, 17 August 2025 (UTC)

Bug in the documentation or the implementation?

In the topic of sortkeys, it currently says

Furthermore, other general articles that are highly relevant to the category should be sorted with an asterisk as key so that they also appear at the top of a category but beneath the main article/s. Example: [[Category:Example|*]] Those articles are typically called "History of example", "Types of example", "List of example" or similar.

but if you use an asterisk as the sortkey, at the start of the list of articles in the category you get the list article (for the sake of an example) appearing after a bullet (which is fine) but with an extraneous asterisk on the line above (huh?). If you just use a space as the sortkey, you get a bullet followed by the list article but without the extraneous asterisk on the line above. So this is either an error in the documentation or a bug in the code, but I think one or the other needs to be updated. Kerry (talk) 03:57, 30 August 2025 (UTC)

@Kerry Raymond: It's neither an error nor a bug. Category pages have up to three headings: Subcategories; Pages in category; and Media in category. Under each of these headings, there will be one or more subheadings, each being a single character - the first character of the sortkey for the pages listed below that subheading. So pages with the letter "A" at the start of the sortkey are placed below the "A" subheading; pages with the letter "B" at the start of the sortkey are placed below the "B" subheading; and so on. The only exceptions are pages where the sortkey begins with a digit: these are grouped together under an "0–9" subheading. It follows that pages with an asterisk at the start of the sortkey are placed below the "*" subheading; and less obviously, pages with a space at the start of the sortkey are placed below the " " subheading. It's not easy to see, but it's there, as may be checked by using your browser's "View Page Source" feature - it will be the element <h3> </h3>. --Redrose64 🌹 (talk) 22:48, 30 August 2025 (UTC)
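
For anyone who finds the grouping easier to follow as code, here is a minimal Python sketch of the behaviour described above. It is an illustration only, not the actual MediaWiki implementation: the function names `subheading_for` and `group_by_subheading`, the sample page titles, and the simple uppercase code-point ordering are all assumptions made for the example (the real software uses its own collation).

```python
from collections import defaultdict


def subheading_for(sortkey: str) -> str:
    """Return the one-character subheading a sortkey falls under (hypothetical helper)."""
    first = sortkey[:1]
    # Sortkeys starting with an ASCII digit share a single "0–9" subheading.
    if first.isascii() and first.isdigit():
        return "0–9"
    # Otherwise the subheading is the first character itself,
    # so "*" gets a "*" subheading and " " gets a " " subheading.
    return first.upper()


def group_by_subheading(pages):
    """Group (title, sortkey) pairs under subheadings, in sorted order.

    Assumption: ordering by uppercase code point is only a stand-in
    for whatever collation the wiki really applies.
    """
    groups = defaultdict(list)
    for title, sortkey in sorted(pages, key=lambda page: page[1].upper()):
        groups[subheading_for(sortkey)].append(title)
    return dict(groups)


# Hypothetical sample data: (page title, sortkey)
pages = [
    ("Banana", "Banana"),
    ("List of examples", "*"),
    ("Apple", "Apple"),
    ("Example", " "),
    ("1900 example", "1900 example"),
]
groups = group_by_subheading(pages)
for heading, titles in groups.items():
    print(repr(heading), titles)
```

In this sketch the page keyed with a space and the page keyed with an asterisk both land at the top, each under its own subheading; the space subheading is simply invisible when rendered, which matches the difference Kerry observed.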

Human posing as machine imitating human

I've been trying to fortify the article Mechanical Turk, now undergoing a featured article review. Its fate is unlikely to hang on its categorization, but this nevertheless interests me.

I think its current set of categories calls for the addition of Category:19th-century hoaxes, Category:19th century in chess, and perhaps also Category:19th-century robots (and removal of Category:History of chess as superfluous).

Wikipedia lacks categories for human-powered devices (I suppose the closest would be Category:Hand tools, irrelevant here), chess hustlers, itinerant entertainers, exhibits, or "curiosities". We do however have Category:1770 beginnings, Category:1770 introductions, Category:1770 works, Category:1854 endings, Category:1854 fires, Category:Fires in Philadelphia, the title of each of which sounds as if it would be at least moderately appropriate -- but for each, a quick look at the list of what is currently so categorized suggests that the Turk would be anomalous.

And yes, I'm a little tempted to add the Turk to Category:People from Bratislava, Category:People from Hietzing, and Category:Deaths from fire in the United States -- but Wikipedia is and must remain a solemn enterprise, so no I shan't.

Comments and suggestions welcome. -- Hoary (talk) 22:16, 2 September 2025 (UTC)

Allow navboxes in article categories

Wikipedia:Categorization#Templates says: "Templates are not articles, and thus do not belong in content categories". I suggest an exception for navigation templates (navboxes) which correspond closely to a category. Navboxes are reader-facing and often give better or alternative navigation for a category, e.g. by ordering the items differently, piping out unnecessary title parts, adding information, and including informative section links, redirects and redlinks. Many navboxes are already in such article categories because editors found it practical. Consider for example Category:2025 ATP Tour versus {{2025 ATP Tour}}. The category is alphabetical and rather large and messy for many purposes. The navbox is very ordered. It organizes tournaments by tournament category with the largest on top. Each category is ordered chronologically. Most parts of the titles are piped out so you get a compact navbox full of information, including tournaments which don't have articles yet. Official sponsored names like 2025 BMW Open, which can change from year to year, are replaced with the city, like Munich. I actually find this navbox so helpful that I would even support transcluding it on the category page but that may be going too far in general. I suggest navboxes in closely associated content categories be sorted under * at the start to help navigation and not under τ at the end as suggested for templates in general at WP:SORTKEY, which also says: "Furthermore, other general articles that are highly relevant to the category should be sorted with an asterisk as key so that they also appear at the top of a category but beneath the main article/s." The discussed navboxes are highly relevant to the category. PrimeHunter (talk) 14:28, 10 September 2025 (UTC)