The distribution of a quantity is of type String. A rule under which each distribution has an unambiguous identifier may be better. The problem, however, is that we would have to define these identifiers ourselves, as no reference was found that provides such identifiers in an easy-to-use form.
The following comment was added by NPL to the discussion:
It would be interesting to see statistical distributions implemented as an enumerated list, say {dist1(para1, para2), dist2(para1), …} = {Gaussian(mu, sigma), Poisson(lambda), …}. Could the implementation share some similarity with pointers to functions in the C language? However, it would be difficult to list all possible statistical distributions exhaustively. For distributions that are not listed, would LaTeX-style syntax be useful? It is well known to most users in scientific fields, and the syntax is well developed.
What do you think, can we put together a proposal for an implementation by next Monday?
Enumerations have the disadvantage that we cannot define individual parameter values like "dist1(par1,par2,par3)".
I started to elaborate a different idea:
The enumeration-like structure may be realized with a regular expression instead: define keywords for the distributions (e.g. normal, rectangular, ...) and append a variable-length list of parameters, separated by some character, for example "normal,0.14,0.02".
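As a rough illustration of the regular-expression idea, the following Python sketch accepts a keyword followed by any number of comma-separated numeric parameters. The keyword list, the name `parse_distribution`, and the choice of the comma as separator are assumptions made for this example only; nothing here is an agreed D-SI format.

```python
import re

# Hypothetical keyword list; the final names are still under discussion.
KEYWORDS = ("normal", "rectangular", "triangular")

# A decimal number with optional sign and exponent, e.g. 0.14, -3, 1e-6.
NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?"

# Keyword, then zero or more ",<number>" groups, and nothing else.
PATTERN = re.compile(
    rf"^(?P<name>{'|'.join(KEYWORDS)})(?P<params>(?:,{NUMBER})*)$"
)


def parse_distribution(text):
    """Return (keyword, [parameters]) or raise ValueError."""
    match = PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid distribution string: {text!r}")
    raw = match.group("params")
    # raw looks like ",0.14,0.02"; the first split element is empty.
    params = [float(p) for p in raw.split(",")[1:]] if raw else []
    return match.group("name"), params
```

With this pattern, "normal,0.14,0.02" yields the keyword `normal` with the two parameters 0.14 and 0.02, while an unknown keyword such as "gaussian,1" is rejected. Checking that each keyword receives the right number of parameters would be a separate step.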
The key words for distributions can be taken from GUM documents:
GUM S1 - JCGM 101:2008 - Table 1
rectangular R(a, b)
curvilineartrapezoid CTrap(a, b, d)
trapezoidal Trap(a, b, β)
triangular T(a, b)
arcsine U(a, b) (sinusoidal, U-shaped)
normal N(x, u²(x)) (Gaussian)
normal-multivariate N(x, U_x)
normal-bivariate N(x, U_x)
gamma G(q + 1, 1), q = objects counted
exponential Ex(1/x)
t-distribution-1 t_{n−1}(x̄, s²/n)
t-distribution-2 t_{νeff}(x, (U_p/k_p)²)
GUM S2 - JCGM 102:201
t-distribution-multivariate t_ν(x̄, S/n)
Parameters with degrees of freedom (DOF) in the examples above: ν, n-1, q
Parameters with geometrical shape of elements: d, β
All other parameters are already provided with the D-SI data "value", "uncertainty", "covarianceMatrix",...
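The split between parameters that must be stated with the distribution and parameters already carried by the D-SI data could be recorded in a small lookup table. The Python sketch below is only an illustration: the keyword spellings, the parameter names (`d`, `beta`, `q`, `n`, `nu_eff`), and the function `missing_parameters` are hypothetical, and the table follows the assumption from this comment that everything else comes from "value", "uncertainty", and so on.

```python
# Extra parameters per keyword, i.e. those NOT covered by D-SI "value",
# "uncertainty", etc. Names and spellings are assumptions for this sketch.
EXTRA_PARAMETERS = {
    "rectangular": (),
    "curvilineartrapezoid": ("d",),      # geometrical shape
    "trapezoidal": ("beta",),            # geometrical shape
    "triangular": (),
    "arcsine": (),
    "normal": (),
    "gamma": ("q",),                     # number of objects counted
    "exponential": (),
    "t-distribution-1": ("n",),          # degrees of freedom is n - 1
    "t-distribution-2": ("nu_eff",),     # effective degrees of freedom
}


def missing_parameters(keyword, supplied):
    """Return the names of required extra parameters absent from `supplied`."""
    try:
        required = EXTRA_PARAMETERS[keyword]
    except KeyError:
        raise ValueError(f"unknown distribution keyword: {keyword!r}")
    return [name for name in required if name not in supplied]
```

A validation step could then report, for example, that a trapezoidal distribution given without `beta` is incomplete, while a normal distribution needs nothing beyond the existing D-SI elements.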
The validation service from WP 3 could validate this statement.
The issue was reopened to continue defining the semantics for stating distributions in the D-SI.
First, please note the outline of distributions listed in the GUM in one of the previous comments. This could be a starting point for the specification of the distributions.
What needs to be discussed is how to provide the additional parameters of a distribution (e.g. DOFs). Basic parameters such as expectation value and variance are already provided by the D-SI elements value and uncertainty.
A basic question is whether the distribution should contain all information itself, so that it could be used as a standalone element, or whether it is acceptable for it to refer to data provided by real, complex, etc.
In the first case, we may simply use existing formats, e.g. from the R language. In the latter case, we would most probably have to come up with a new format.
The element "expandedUnc" in the SI_Format contains the element "distribution", which is currently a string. In practice, the distribution stated here is almost always Gaussian or rectangular.
My suggestion is to insert a selection list / choice with the following items:
Normal
Rectangular
Triangular
U-Shaped
Step
other
This way, there is no misunderstanding when someone writes "gaussian" and someone else writes "normal", etc.
(Source for the names: Distribution for Uncertainty Estimation using the ISO Guide to the Expression of Uncertainty in Measurements)
Additionally, a string element can be added for "otherDistribution".
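To make the selection-list idea concrete, here is a minimal Python sketch of the six items together with a free-text fallback for "otherDistribution". The class name `Distribution`, the function `classify`, and the alias table are assumptions for this illustration; in the SI_Format itself this would presumably be expressed as a schema enumeration rather than code.

```python
from enum import Enum


class Distribution(Enum):
    NORMAL = "Normal"
    RECTANGULAR = "Rectangular"
    TRIANGULAR = "Triangular"
    U_SHAPED = "U-Shaped"
    STEP = "Step"
    OTHER = "other"


# Hypothetical synonyms that a migration tool might map onto the list.
ALIASES = {
    "gaussian": Distribution.NORMAL,
    "uniform": Distribution.RECTANGULAR,
    "arcsine": Distribution.U_SHAPED,
}


def classify(text):
    """Map a free-text name to (Distribution, otherDistribution or None)."""
    cleaned = text.strip()
    key = cleaned.lower()
    for member in Distribution:
        if member.value.lower() == key:
            return member, None
    if key in ALIASES:
        return ALIASES[key], None
    # Anything unrecognized goes to "other" and keeps its original text.
    return Distribution.OTHER, cleaned
```

Both "gaussian" and "normal" end up as the same list item here, and an unlisted name is preserved in the second return value, which plays the role of the "otherDistribution" string.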
@CMueller-Schoell & @JHaller: Can we continue the discussion here? We had the impression that the issue of degrees of freedom is a question about how to provide information on numerical distributions.
Please also see #33 regarding discussions about uncertainty dependencies and standard uncertainty.