This page details the SVG extensions used by KanjiVG to add information on components of kanji, such as the radicals, as well as information on the expected shapes of strokes.

All of the SVG extensions used by KanjiVG are XML attributes with an added kvg: prefix, which denotes a "namespace" in XML terminology.
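
As a rough illustration of what the namespace means in practice, the sketch below reads kvg: attributes with Python's standard xml.etree.ElementTree module. The namespace URI, the SAMPLE fragment and the helper kvg_attr are assumptions made for this example; check the xmlns:kvg declaration in your copy of the data before relying on the URI.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI (not stated on this page); verify against your data.
KVG_NS = "http://kanjivg.tagaini.net"

# A minimal stand-in fragment, with the namespace declared explicitly.
SAMPLE = (
    '<g xmlns:kvg="http://kanjivg.tagaini.net" '
    'id="kvg:04eee-g1" kvg:element="亻" kvg:original="人"/>'
)

def kvg_attr(node, name):
    """Return the value of the kvg:<name> attribute, or None if absent."""
    # ElementTree stores namespaced attributes as "{uri}localname".
    return node.get(f"{{{KVG_NS}}}{name}")

group = ET.fromstring(SAMPLE)
element = kvg_attr(group, "element")
original = kvg_attr(group, "original")
missing = kvg_attr(group, "position")  # not present on this group
```

Note that the "kvg:" at the start of an id value is just part of the string; only attribute names such as kvg:element are actually namespaced.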

There are two root SVG groups. The StrokePaths group is a set of standard SVG paths that gives the strokes of the kanji. The strokes are ordered in the stroke order given by the references. It also contains groups which describe the structure of the kanji, such as its decomposition into elements, and which strokes comprise its radical.

The StrokeNumbers group is an optional group that gives a convenient position for stroke-order numbers, useful for displaying in printed material for instance. The stroke numbers are positioned near their corresponding strokes, at their starting points.

Undocumented and unknown features, and notes on possible flaws in the data, are distinguished with a pale green background.

StrokePaths groups

Kanji are often made of several components, referred to as elements in this documentation. For instance, 頑 can be seen as a combination of 元 on its left and 頁 on its right. KanjiVG uses SVG groups to reflect this structure, so the KanjiVG entry for 頑 contains two groups under the parent StrokePaths group, one for the left side and one for the right side of the kanji. The 元 and 頁 are encoded in these groups under their element attributes. SVG groups provide an elegant way to collect strokes into a given group.

In the case that the elements do not consist of contiguous strokes, KanjiVG uses the part and number XML attributes to distinguish them.

This section explains the SVG attributes used in the groups under the StrokePaths group which describe the structure of a kanji, such as radicals and other sub-elements.

General attributes

id

The KanjiVG identification number for this group. It consists of the prefix kvg:, followed by the Unicode value of the kanji as a five-digit lower-case hexadecimal number, followed by a hyphen and the letter "g", followed by a decimal number from one to the total number of groups. For example, the first group in the file for 仮 (U+4EEE) has the ID kvg:04eee-g1.

The group ID numbers are always consecutive positive whole numbers.
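
The ID format can be checked mechanically. The following is a minimal sketch, assuming IDs of the plain form seen in the 仮 listing on this page (kvg:04eee-g1 and so on); the names GROUP_ID, parse_group_id and are_consecutive are invented for this example. The outermost group, whose ID has no "-g" part, is deliberately not matched.

```python
import re

# "kvg:" + five lower-case hex digits + "-g" + a positive decimal number.
GROUP_ID = re.compile(r"^kvg:(?P<code>[0-9a-f]{5})-g(?P<index>[1-9][0-9]*)$")

def parse_group_id(group_id):
    """Return (kanji, group index), or None if the ID has another shape."""
    m = GROUP_ID.match(group_id)
    if m is None:
        return None
    return chr(int(m.group("code"), 16)), int(m.group("index"))

def are_consecutive(group_ids):
    """True if the IDs, in document order, are numbered 1, 2, 3, ..."""
    parsed = [parse_group_id(g) for g in group_ids]
    if any(p is None for p in parsed):
        return False
    return [index for _, index in parsed] == list(range(1, len(parsed) + 1))
```

For instance, parse_group_id("kvg:04eee-g3") yields the kanji 仮 and the index 3, while the outermost ID "kvg:04eee" yields None.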

KVG namespace attributes

All these attributes are placed under the kvg namespace, for example kvg:original="家".

element

This attribute specifies which kanji best represents the group physically. It should be the Unicode character that resembles the group as much as possible.

The value of element on the outermost group of the strokes is the same as the kanji represented by the SVG.

number

This relatively rare attribute identifies an element of a kanji when two conditions both hold: the element occurs several times in the kanji, and, due to the stroke order, more than one of these occurrences is broken into parts, so that the part attribute has to be used for more than one occurrence. In other words, the number attribute says which occurrence a part belongs to when the part attribute alone would be ambiguous.

It is only used in a few places in KanjiVG where there are two different sets of the same element, such as 05716.svg, the character 圖, where there are four 口 elements, two of which are broken into parts one and two due to the stroke order. Inspecting the source code of that SVG file is the best way to understand what the kvg:number attribute does.

Generally, elements which can be represented by contiguous blocks of strokes do not have a number attribute, even if multiple cases of the same element occur in a character, so, for example, the 口 elements of 品 do not have a number attribute.

original

This attribute specifies which kanji represents the group from a semantic point of view. This attribute only needs to be present if there is a difference between the semantic and physical representation of the group.

For example, 仮 has two groups. The left one has 亻 (called ninben) for its element attribute, and 人, meaning "person", for its original attribute, because ninben is a variation of 人. However, the right side has 反 for element, which is not a variation, so an original attribute is not necessary.

<g id="kvg:StrokePaths_04eee" style="fill:none;stroke:#000000;stroke-width:3;stroke-linecap:round;stroke-linejoin:round;">
<g id="kvg:04eee" kvg:element="仮">
	<g id="kvg:04eee-g1" kvg:element="亻" kvg:variant="true" kvg:original="人" kvg:position="left" kvg:radical="general">
		<path id="kvg:04eee-s1" kvg:type="㇒" d="M32.01,17c0.22,1.93-0.31,3.72-1.02,5.37C26.5,32.93,20.8,42.85,10.5,55.7"/>
		<path id="kvg:04eee-s2" kvg:type="㇑" d="M25.48,37.5c0.57,0.57,1,1.69,1,3.24c0,11.3,0,33.32,0,46.02c0,3.05,0,5.56,0,7.25"/>
	</g>
	<g id="kvg:04eee-g2" kvg:element="反" kvg:position="right" kvg:phon="叚V/反">
		<g id="kvg:04eee-g3" kvg:element="厂">
			<path id="kvg:04eee-s3" kvg:type="㇐" d="M47.34,22.01c1.27,0.33,3.61,0.53,4.86,0.33c11.42-1.84,23.3-4.59,32.75-5.84c2.08-0.27,3.38,0.16,4.44,0.32"/>
			<path id="kvg:04eee-s4" kvg:type="㇒" d="M52.65,23.81c1.14,1.14,1.49,2.48,1.42,4.7c-0.74,22.33-5.06,47.31-16.01,61.4"/>
		</g>
		<g id="kvg:04eee-g4" kvg:element="又">
			<path id="kvg:04eee-s5" kvg:type="㇇" d="M56.7,41.74c1.51,0.37,2.95,0.43,5.97-0.13c3.02-0.56,15.29-4.17,17.37-4.71c2.97-0.77,4.79,1.08,3.83,4.1c-6.62,20.75-11.62,38.63-36.26,53.52"/>
			<path id="kvg:04eee-s6" kvg:type="㇏" d="M56.12,52.12c5.64,0.81,18.99,22.02,31.03,33.71c2.35,2.29,5.22,4.66,7.84,6.11"/>
		</g>
	</g>
</g>
</g>
Illustration of the structure of the KanjiVG file for 仮 (U+4EEE).
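
To show how this nesting might be consumed, here is a small sketch that walks the group tree of 仮 and reports, for each group, its physical element, its semantic element (original if present, otherwise element) and its stroke count. The namespace URI is an assumption, the path data is omitted for brevity, and the helper names are invented for this example.

```python
import xml.etree.ElementTree as ET

KVG = "{http://kanjivg.tagaini.net}"  # assumed namespace URI

# The group structure of 仮 from the listing above, with the path data omitted.
SAMPLE = """
<g xmlns:kvg="http://kanjivg.tagaini.net" id="kvg:04eee" kvg:element="仮">
  <g id="kvg:04eee-g1" kvg:element="亻" kvg:variant="true" kvg:original="人" kvg:position="left" kvg:radical="general">
    <path id="kvg:04eee-s1" kvg:type="㇒"/>
    <path id="kvg:04eee-s2" kvg:type="㇑"/>
  </g>
  <g id="kvg:04eee-g2" kvg:element="反" kvg:position="right">
    <g id="kvg:04eee-g3" kvg:element="厂">
      <path id="kvg:04eee-s3" kvg:type="㇐"/>
      <path id="kvg:04eee-s4" kvg:type="㇒"/>
    </g>
    <g id="kvg:04eee-g4" kvg:element="又">
      <path id="kvg:04eee-s5" kvg:type="㇇"/>
      <path id="kvg:04eee-s6" kvg:type="㇏"/>
    </g>
  </g>
</g>
"""

def local(tag):
    """Strip any '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]

def semantic_element(group):
    """Prefer kvg:original (semantic form) over kvg:element (physical form)."""
    return group.get(KVG + "original") or group.get(KVG + "element")

def outline(group, depth=0):
    """Return (depth, element, semantic element, stroke count) per group."""
    strokes = sum(1 for node in group.iter() if local(node.tag) == "path")
    rows = [(depth, group.get(KVG + "element"), semantic_element(group), strokes)]
    for child in group:
        if local(child.tag) == "g":
            rows.extend(outline(child, depth + 1))
    return rows

rows = outline(ET.fromstring(SAMPLE.strip()))
```

Only the 亻 group differs between the two views: its element is 亻 but its semantic element is 人.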

part

When the strokes that make up an element are not consecutive in the stroke order, the element may be spread over several groups of paths in the file. The part attribute numbers these groups and marks them as belonging to the same component. There is also a number attribute, which is used in the rare cases where two occurrences of the same element both have non-consecutive strokes within the same character.
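
One way to put split elements back together is to key each group on its element and number values and concatenate the strokes in part order. The sketch below does this for the structure of 五 (U+4E94), whose 二 is split into two parts. The namespace URI and the function name split_elements are assumptions for this example, the path data is omitted, and real files that declare a default SVG namespace would need the qualified tag names in the iter() calls.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

KVG = "{http://kanjivg.tagaini.net}"  # assumed namespace URI

# Group structure of 五, path data omitted.
SAMPLE = """
<g xmlns:kvg="http://kanjivg.tagaini.net" id="kvg:04e94" kvg:element="五">
  <g id="kvg:04e94-g1" kvg:element="二" kvg:part="1" kvg:radical="tradit">
    <g id="kvg:04e94-g2" kvg:element="一" kvg:radical="nelson">
      <path id="kvg:04e94-s1" kvg:type="㇐"/>
    </g>
  </g>
  <path id="kvg:04e94-s2" kvg:type="㇑a"/>
  <path id="kvg:04e94-s3" kvg:type="㇕c"/>
  <g id="kvg:04e94-g3" kvg:element="二" kvg:part="2" kvg:radical="tradit">
    <path id="kvg:04e94-s4" kvg:type="㇐"/>
  </g>
</g>
"""

def split_elements(root):
    """Map (element, number) to the stroke IDs of all its parts, in part order."""
    parts = defaultdict(list)
    for group in root.iter("g"):
        part = group.get(KVG + "part")
        if part is None:
            continue  # contiguous element: nothing to merge
        key = (group.get(KVG + "element"), group.get(KVG + "number"))
        strokes = [path.get("id") for path in group.iter("path")]
        parts[key].append((int(part), strokes))
    return {
        key: [s for _, strokes in sorted(chunks) for s in strokes]
        for key, chunks in parts.items()
    }

merged = split_elements(ET.fromstring(SAMPLE.strip()))
```

Here the number component of the key is None because 五 needs no kvg:number; in a character such as 圖 it would keep the parts of different 口 occurrences apart.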

partial

Should be present and set to true if the group only represents the element attribute partially, i.e. if not all its strokes are present.

phon

A large number of kanji consist of a radical and a phonetic component, which indicates the Sino-Japanese pronunciation. The phon attribute should mark the part indicating the pronunciation.

The values of this attribute are inconsistent, and the meanings of many of them are completely undocumented. See issue 312 on GitHub for more details.

position

Defines where this group is located with respect to the other groups with the same parent. Not every element has a position value. Possible values are:

bottom
This part is under another part.
kamae
This part is wrapped around another part, such as 門. This is used very inconsistently in KanjiVG as a grab-bag for various different structures.
left
This part is left of another part.
nyo
This part is left and under another part, such as 辶.
nyoc
This part is the complement or counterpart of a nyo part.
right
This part is right of another part.
tare
This part is left and above another part, such as 广.
tarec
This part is the complement or counterpart of a tare part.
top
This part is above another part.
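
Because the set of values is closed, a simple membership test can flag unexpected ones. This is a minimal sketch; the names POSITIONS and unknown_positions are invented for this example, and the set is taken from the list above, including the right value used in the 仮 listing.

```python
# Position values documented in the list above.
POSITIONS = {
    "bottom", "kamae", "left", "nyo", "nyoc",
    "right", "tare", "tarec", "top",
}

def unknown_positions(values):
    """Return the sorted position values that are not documented.

    `values` holds the kvg:position of each group, or None where the
    attribute is absent (which is allowed).
    """
    return sorted({v for v in values if v is not None and v not in POSITIONS})
```

For example, unknown_positions(["left", "right", None]) returns an empty list, while a stray value such as "middle" would be reported.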

radical

This attribute is set if this group of strokes is considered a radical of the kanji. Its value indicates which reference treats the group as the radical, as follows.

general
The generally accepted radical which authors agree on.
jis
This marks the radicals used by JIS Kanji Jiten, used by Kanjidic, which sometimes differ from the general or tradit radicals. This value was added to deal with inconsistencies between KanjiVG and Kanjidic and other references.
nelson
The keyword "nelson" is used for Nelson radicals.
tradit
The keyword "tradit" is used for the "traditional" radical, where the Kangxi radical disagrees with Nelson.
<g id="kvg:04e94" kvg:element="五">
	<g id="kvg:04e94-g1" kvg:element="二" kvg:part="1" kvg:radical="tradit">
		<g id="kvg:04e94-g2" kvg:element="一" kvg:radical="nelson">
			<path id="kvg:04e94-s1" kvg:type="㇐" d="M31.75,23.15c2.8,0.67,5.54,0.42,8.36,0.12c9.3-0.99,22.18-2.4,34.14-3.21c2.49-0.17,5.04-0.33,7.5,0.2"/>
		</g>
	</g>
	<path id="kvg:04e94-s2" kvg:type="㇑a" d="M55.75,25.25c0.62,1.25,1.02,3.01,0.5,5c-3.12,11.88-14,44.12-19.75,59"/>
	<path id="kvg:04e94-s3" kvg:type="㇕c" d="M25.5,55.25c2.07,1.24,4.73,1.03,7,0.81c15.49-1.45,29.89-3.03,42.25-4.06c3-0.25,4.25,1.75,3.5,3.75c-2.24,5.96-6,20.75-7.75,31.5"/>
	<g id="kvg:04e94-g3" kvg:element="二" kvg:part="2" kvg:radical="tradit">
		<path id="kvg:04e94-s4" kvg:type="㇐" d="M11.25,90.5c3.04,0.81,6.52,0.63,9.63,0.41c15.71-1.1,43.9-2.8,67.75-3.8c3.41-0.14,6.9-0.4,10.25,0.39"/>
	</g>
</g>
The kanji for "five", 五, has a traditional Kangxi radical 二, which is split into two parts, the upper and lower horizontal strokes, and a Nelson radical of 一, which is the upper stroke. The two parts of the 二 are given part numbers of 1 and 2.
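
The radical markup of 五 can be read back as follows. This sketch collects, for each reference keyword, the radical's element and the IDs of the strokes that belong to it, merging split parts along the way. It assumes the namespace URI shown, and that a kanji has at most one radical per reference; the function name radicals is invented for this example and the path data is omitted.

```python
import xml.etree.ElementTree as ET

KVG = "{http://kanjivg.tagaini.net}"  # assumed namespace URI

# Group structure of 五 from the listing above, path data omitted.
SAMPLE = """
<g xmlns:kvg="http://kanjivg.tagaini.net" id="kvg:04e94" kvg:element="五">
  <g id="kvg:04e94-g1" kvg:element="二" kvg:part="1" kvg:radical="tradit">
    <g id="kvg:04e94-g2" kvg:element="一" kvg:radical="nelson">
      <path id="kvg:04e94-s1" kvg:type="㇐"/>
    </g>
  </g>
  <path id="kvg:04e94-s2" kvg:type="㇑a"/>
  <path id="kvg:04e94-s3" kvg:type="㇕c"/>
  <g id="kvg:04e94-g3" kvg:element="二" kvg:part="2" kvg:radical="tradit">
    <path id="kvg:04e94-s4" kvg:type="㇐"/>
  </g>
</g>
"""

def radicals(root):
    """Map each radical reference to (element, stroke IDs in document order)."""
    found = {}
    for group in root.iter("g"):
        reference = group.get(KVG + "radical")
        if reference is None:
            continue
        # Later parts of the same radical extend the stroke list of the first.
        _, strokes = found.setdefault(reference, (group.get(KVG + "element"), []))
        strokes.extend(path.get("id") for path in group.iter("path"))
    return found

by_reference = radicals(ET.fromstring(SAMPLE.strip()))
```

The first stroke appears under both references, since the Nelson radical 一 is nested inside the first part of the traditional radical 二.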

Unicode has more than one code point which may represent each radical. The choices of radicals which have been used by KanjiVG are explained on the Radicals page.

radicalForm

This is set to the value true for a limited number of groups where a radical-like form of a character described by original is provided as the element.

tradForm

The Kanjidic file with which Ulrich Apel worked in the beginning favored the radicals given in the Nelson character dictionary. These sometimes differ from the radicals given in "traditional" Japanese dictionaries, which are marked up as well.

variant

Should be present and set to true if the group is a variant of the element attribute.

Strokes

Each individual kanji stroke is represented by one SVG <path> element.

General attributes

d

The SVG path information itself. This describes the shape of the line.

Although there is no rule disallowing other SVG path commands, in practice all of the KanjiVG data consists of cubic Bézier curves. In SVG terminology, each path is made up of only M/m, C/c, and S/s commands; no other path commands are present. None of the strokes contains more than one sub-path, that is to say, no stroke has more than one "moveto" command.
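
These observations can be turned into a quick sanity check on a d value. The sketch below is not a full SVG path parser; it only extracts the command letters and tests them against the restrictions described above. The function names are invented for this example.

```python
import re

# Any ASCII letter except e/E, which can appear inside numbers as an exponent
# marker and is not an SVG path command.
COMMAND = re.compile(r"[A-DF-Za-df-z]")
ALLOWED = set("MmCcSs")

def path_commands(d):
    """Return the command letters of a path, in order."""
    return COMMAND.findall(d)

def is_kanjivg_style(d):
    """True if d is one moveto followed only by cubic Bézier commands."""
    commands = path_commands(d)
    return (
        bool(commands)
        and commands[0] in ("M", "m")
        and sum(c in ("M", "m") for c in commands) == 1
        and all(c in ALLOWED for c in commands)
    )

# First stroke of 仮, taken from the listing earlier on this page.
stroke = "M32.01,17c0.22,1.93-0.31,3.72-1.02,5.37C26.5,32.93,20.8,42.85,10.5,55.7"
commands = path_commands(stroke)
```

A path containing a line command (such as "M0,0L10,10") or a second moveto would fail the check.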

id

The KanjiVG identification number for this stroke. It contains the prefix kvg: followed by the Unicode value of the kanji as a five-digit lower-case hexadecimal number, followed by any variant information, followed by a hyphen and the letter "s", followed by a decimal number from one to the total stroke count. For example stroke 3 of the file kanji/053ec.svg has the ID number kvg:053ec-s3.

The stroke IDs are consecutive positive whole numbers starting from 1 which correspond to the stroke number of the stroke.
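
A stroke ID can be taken apart with a regular expression. In this sketch the exact shape of the variant information is an assumption (a single hyphen-separated token between the code point and the "-s" part); the ID "kvg:05de5-Kaisho-s1" is used only as a hypothetical illustration of that shape, and the names STROKE_ID and parse_stroke_id are invented for this example.

```python
import re

STROKE_ID = re.compile(
    r"^kvg:(?P<code>[0-9a-f]{5})"      # five-digit lower-case hex code point
    r"(?:-(?P<variant>[^-]+))?"        # optional variant token (assumed shape)
    r"-s(?P<number>[1-9][0-9]*)$"      # stroke number, starting from 1
)

def parse_stroke_id(stroke_id):
    """Return (code point, variant or None, stroke number), or None."""
    m = STROKE_ID.match(stroke_id)
    if m is None:
        return None
    return int(m.group("code"), 16), m.group("variant"), int(m.group("number"))
```

With the example from the text, parse_stroke_id("kvg:053ec-s3") gives the code point 0x53EC, no variant, and stroke number 3.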

KVG namespace attributes

These attributes are under the kvg: namespace.

type

The shape of the stroke. It can be used to know how the stroke should be rendered.

The values of this attribute use the keys of Unicode's CJK Strokes, which occupy code positions from U+31C0 to U+31EF. The names of these, such as D or HZ, are the initials of the Chinese names.
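
The code-point range gives a simple way to inspect a type value. This sketch assumes that the first character of the value is the CJK stroke character and that anything after it (as in "㇑a" or "㇕c" in the 五 listing) is a suffix whose meaning is covered on the Stroke types page; the function names are invented for this example.

```python
def is_cjk_stroke(ch):
    """True if ch lies in the CJK Strokes range U+31C0 to U+31EF."""
    return 0x31C0 <= ord(ch) <= 0x31EF

def split_type(type_value):
    """Split a kvg:type value into (stroke character, remaining suffix)."""
    return type_value[0], type_value[1:]

# Type values that appear in the listings on this page.
seen = ["㇒", "㇑", "㇐", "㇇", "㇏", "㇑a", "㇕c"]
all_in_range = all(is_cjk_stroke(split_type(t)[0]) for t in seen)
```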

Please see the Stroke types page for full information on stroke types.

Stroke numbers

Stroke numbers are represented by a top-level group with an ID of the form kvg:StrokeNumbers_abcde, where abcde is the identifier of the file. This group contains text elements. Each text element is located on the diagram using a transform attribute. The text within each text element is the stroke number in digits, from one to the total number of strokes. The stroke numbers should correspond to the id value of the individual strokes.

The stroke numbers are located to the side of the beginning of the stroke whose order they indicate. Generally, they should not overlap the strokes.
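
As a sketch of how the numbers and their positions might be read, the code below parses a StrokeNumbers group. The page only says that a transform attribute is used, so the matrix(a b c d e f) form, the coordinates in SAMPLE and the function name number_positions are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment: the coordinates are made up for illustration.
SAMPLE = """
<g id="kvg:StrokeNumbers_04e94">
  <text transform="matrix(1 0 0 1 24.5 20.5)">1</text>
  <text transform="matrix(1 0 0 1 45.5 32.5)">2</text>
</g>
"""

MATRIX = re.compile(r"matrix\(([^)]*)\)")

def number_positions(group):
    """Map each stroke number to the (x, y) translation of its label."""
    positions = {}
    for text in group.iter("text"):
        m = MATRIX.search(text.get("transform", ""))
        if m is None:
            continue  # some other transform form; not handled in this sketch
        values = [float(v) for v in re.split(r"[ ,]+", m.group(1).strip())]
        # In matrix(a b c d e f), e and f are the x and y translation.
        positions[int(text.text)] = (values[4], values[5])
    return positions

positions = number_positions(ET.fromstring(SAMPLE.strip()))
numbers_are_complete = sorted(positions) == list(range(1, len(positions) + 1))
```

The last line checks that the labels run from one to the number of text elements, matching the description above.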