Abstract
Introduction
Polycystic ovary syndrome (PCOS) is a common endocrine condition with a prevalence of 5–10% among women of reproductive age, 1 although this can vary between subpopulations. Stein and Leventhal first described this gynaecological condition in 1935, based on seven women who had a combination of obesity, hirsutism, amenorrhoea and bilateral enlarged polycystic ovaries.2,3 Since that first characterisation, our understanding of its etiology, diagnosis and treatment has evolved considerably. In the early 1990s, the National Institutes of Health (NIH) proposed the first diagnostic criteria for PCOS, defining it as a combination of oligo-anovulation and androgen excess after all other causes of anovulatory infertility have been ruled out. 4 Ultrasound criteria for polycystic ovaries were considered ‘suggestive’ but not diagnostic of PCOS. These criteria were based on consensus expert opinion rather than evidence from clinical trials. In 2003, at a consensus conference in Rotterdam, the European Society for Human Reproduction and Embryology (ESHRE) and the American Society for Reproductive Medicine (ASRM) amended the NIH criteria and included ultrasound features of polycystic ovaries as a third diagnostic criterion. 5 Under these Rotterdam criteria, the diagnosis of PCOS is established if a woman meets two of the three criteria. This was based on the observation that polycystic ovaries were a consistent finding in women with biochemical and clinical evidence of the syndrome. Women with PCOS diagnosed by the Rotterdam criteria can be further divided into four phenotypes, A to D, according to which criteria qualify them for diagnosis. In 2006, the Androgen Excess Society (AES) reviewed existing data and concluded that the original NIH criteria from the 1990s could be accepted with some modifications based on the expert opinion of the 2003 Rotterdam conference.
In addition to the above classifications, Azziz and colleagues, 6 in an AES guideline, suggested nine phenotypes incorporating the criteria of anovulation/oligo-ovulation, clinical hirsutism, biochemical hyperandrogenism and ultrasound features of polycystic ovaries. The Rotterdam criteria were endorsed by the global PCOS guideline published in 2018. The various diagnostic criteria are summarised in Table 1.
Table 1. Diagnostic criteria for PCOS.
AES, Androgen Excess Society; ASRM, American Society for Reproductive Medicine; ESHRE, European Society for Human Reproduction and Embryology; NIH, National Institutes of Health; PCOS, polycystic ovary syndrome.
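The Rotterdam two-of-three rule and its four phenotypes can be illustrated with a minimal sketch. The letter-to-criteria mapping used here is the commonly cited one and is not spelled out in the text above, so it should be read as an assumption; the function and parameter names are hypothetical, and the sketch presumes that other causes of anovulation and androgen excess have already been excluded.

```python
from typing import Optional


def rotterdam_phenotype(hyperandrogenism: bool,
                        ovulatory_dysfunction: bool,
                        pcom: bool) -> Optional[str]:
    """Return a phenotype letter (A to D), or None if fewer than two criteria are met.

    Assumed mapping (not stated in the article):
      A = hyperandrogenism + ovulatory dysfunction + polycystic ovarian morphology
      B = hyperandrogenism + ovulatory dysfunction
      C = hyperandrogenism + polycystic ovarian morphology
      D = ovulatory dysfunction + polycystic ovarian morphology
    """
    if hyperandrogenism and ovulatory_dysfunction and pcom:
        return "A"
    if hyperandrogenism and ovulatory_dysfunction:
        return "B"
    if hyperandrogenism and pcom:
        return "C"
    if ovulatory_dysfunction and pcom:
        return "D"
    # Zero or one criterion: the two-of-three rule is not satisfied.
    return None


print(rotterdam_phenotype(True, False, True))   # C
print(rotterdam_phenotype(False, False, True))  # None
```

Under this mapping, phenotypes A and B correspond to women who would also satisfy the NIH definition, whereas C and D arise only from the 2003 extension.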
Despite the consensus diagnostic criteria being in common use globally, there still exist several controversies in the diagnosis. This article aims to summarise the various controversies arising in the diagnosis of PCOS.
Irregular cycles and ovarian dysfunction
It has been observed that almost three quarters of patients diagnosed with PCOS have abnormal menstruation. 7 In adults, a regular cycle is defined as menstruation occurring every 24 to 35 days, 8 which suggests an ovulatory cycle. Irregular cycles, which reflect ovarian dysfunction and oligo-anovulation, form one of the essential features of the Rotterdam criteria.
Serum progesterone can be measured by radioimmunoassay in the mid-luteal phase to confirm ovulation. In women with irregular cycles, progesterone might need to be measured later in the cycle, depending on cycle length, and the measurement might need to be repeated every 7 days until menstruation commences. If the menstrual cycles are very irregular, measuring serum progesterone is futile, as the diagnosis is based on clinical features and ovulation induction therapy is indicated. 9 The lower limit of serum progesterone used to confirm ovulation ranges from 16 to 28 nmol/L.10–13
The greatest controversy surrounding this diagnostic criterion lies in defining ‘irregular cycles’ during the pubertal transition. Evidence on this subject is very limited, and without a clear definition it is difficult to distinguish irregular cycles due to reproductive immaturity from those due to PCOS. In the first 2–3 years following menarche, cycles may be irregular, with most ranging between 21 and 45 days. 14 Studies show that the lower limit of a normal cycle is 21 days, while the upper limit is somewhat variable at 40–45 days.15–17 Cycles of more than 90 days represent the 95th percentile for length and should warrant further assessment even in the first gynaecological year, where gynaecological age is defined as the number of years since menarche. Three years after menarche, most cycles resemble those of adults.15,18
The frequency of ovulatory cycles is related both to the time since menarche and to the age at menarche. The earlier a girl attains menarche, the earlier her cycles become ovulatory. When the age at menarche is less than 12 years, half of the cycles are ovulatory in the first gynaecological year, and almost all are ovulatory by the fifth gynaecological year. In contrast, in girls with late onset of menarche, it could take up to 8–12 years for all cycles to become ovulatory. 18
Recent data suggest that in the first year post menarche, about half of the cycles are anovulatory. Most cycles fall within a range of 21–45 days, with bleeding lasting 2–7 days. Two years post menarche, 80% of irregular cycles tend to be ovulatory. By the third year post menarche, 95% of cycles are regular and ovulatory, and it is in the remaining 5% of girls with irregular cycles that PCOS should be considered.19–22
The recommendations of the ESHRE international evidence-based guideline on irregular cycles in adolescents are based on paediatric consensus opinion. This consensus recommends that an adolescent girl who still has irregular periods (<21 days or >35 days) 3 years after menarche should be assessed for PCOS. 22
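The cycle-length thresholds quoted in this section can be collected into a short illustrative rule. This is a sketch only, not a clinical tool: it encodes just the two numeric statements made above (cycles of more than 90 days at any gynaecological age, and cycles outside 21–35 days from 3 years post menarche), the function and parameter names are hypothetical, and treating "3 years" as an inclusive boundary is an assumption.

```python
def warrants_pcos_assessment(cycle_length_days: float,
                             years_since_menarche: float) -> bool:
    """Flag a cycle length that, per the thresholds quoted in the text, warrants assessment."""
    # Cycles of more than 90 days warrant further assessment even in the
    # first gynaecological year.
    if cycle_length_days > 90:
        return True
    # From 3 years post menarche (assumed inclusive), cycles shorter than
    # 21 days or longer than 35 days are regarded as irregular.
    if years_since_menarche >= 3:
        return cycle_length_days < 21 or cycle_length_days > 35
    # Earlier than that, irregularity alone may reflect reproductive immaturity.
    return False


print(warrants_pcos_assessment(40, 1))  # False
print(warrants_pcos_assessment(40, 4))  # True
```

The sketch deliberately ignores hyperandrogenism and prolonged period-free intervals, which the text treats as additional reasons to assess earlier.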
If a girl has irregular cycles with period-free intervals of many months, particularly if she has signs of hyperandrogenism in the first few gynaecological years, she should be assessed to rule out PCOS rather than reassured that this is a normal phase of the pubertal transition.20,23
Many adolescents may be considered for, or may already be on, oral contraceptive pill (OCP) treatment. The ESHRE international guidelines recommend assessing menstrual cycle patterns as well as clinical and biochemical hyperandrogenism before commencing OCP therapy in adolescents whose cycles remain irregular more than 1 year after menarche. If the baseline assessment is abnormal, the risk of PCOS must be explained to them and further reassessment might be warranted. If a baseline assessment has not been undertaken, it may be appropriate to stop the pills for 3 months and then assess to rule out PCOS. 22
To summarise, it is still unclear when adolescent menstrual irregularity becomes pathological. In adolescents, setting a clear-cut boundary between PCOS and normal physiological immaturity of the hypothalamic–pituitary axis is controversial. This could in turn lead to over- or underdiagnosis and so distort the estimated overall prevalence of the condition. Further longitudinal studies identifying the natural course of PCOS in young girls, as well as its early predictors, will allow appropriate and timely diagnosis.
Biochemical hyperandrogenism
More than three quarters of women who have PCOS have increased circulating androgen levels. 6 Assessing biochemical hyperandrogenism can help to identify PCOS, especially in women whose signs of hirsutism are absent or unclear. Controversy surrounds which androgens to measure, how to define their normal ranges, which assays to use, cost, access to high-quality tests and the overlap in values between controls and women with PCOS.
What androgens to measure
The androgens that are often measured include total testosterone (TT), free testosterone (FT), calculated bioavailable testosterone (BA-T), calculated free testosterone (calculated FT) using the formula of Vermeulen and colleagues, 24 free androgen index [FAI = 100 × (total testosterone/sex hormone-binding globulin [SHBG])], dehydroepiandrosterone sulphate (DHEAS) and androstenedione.
TT identifies 20–30% of women with PCOS as hyperandrogenemic. However, only 1–3% of testosterone is unbound to plasma proteins, which raises the question of whether TT or FT is the more clinically useful measure. FT identifies 50–60% of women with PCOS as hyperandrogenemic. SHBG levels are reduced in women with PCOS, resulting in a further increase in FT. Androstenedione can be mildly to moderately elevated in PCOS, but a marked rise indicates adrenal pathology, especially 21-hydroxylase-deficient nonclassic congenital adrenal hyperplasia (CAH). Elevated levels of 17-hydroxyprogesterone can indicate nonclassic CAH of the 21-hydroxylase deficiency type, which is a milder, late-onset form of CAH.
Another useful measure is FAI, the ratio of TT to SHBG multiplied by 100. It therefore depends on accurate measurement of both testosterone and SHBG, and FAI results can be biased by inaccuracies in either. 25 Studies show an acceptable correlation between FAI and FT. 25
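As a worked illustration of the FAI formula given above, the following minimal sketch computes the index from TT and SHBG. It assumes both values are expressed in the same molar unit (nmol/L is used here as an assumption, since the text does not specify units); the function name is hypothetical.

```python
def free_androgen_index(total_testosterone_nmol_l: float,
                        shbg_nmol_l: float) -> float:
    """FAI = 100 x (total testosterone / SHBG), both assumed to be in nmol/L."""
    if shbg_nmol_l <= 0:
        raise ValueError("SHBG must be a positive concentration")
    return 100.0 * total_testosterone_nmol_l / shbg_nmol_l


# Hypothetical values chosen only to show the arithmetic.
print(free_androgen_index(2.0, 40.0))  # 5.0
# Halving SHBG at the same TT doubles the index.
print(free_androgen_index(2.0, 20.0))  # 10.0
```

Because the index is a simple ratio, any proportional error in either measurement passes directly into the result, which is why the accuracy of both assays matters.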
Hahn and colleagues 26 studied 133 untreated PCOS patients and 54 healthy control women, measuring TT, SHBG, luteinizing hormone (LH), follicle-stimulating hormone (FSH), androstenedione, DHEAS and albumin. They concluded that PCOS patients had significantly higher levels of the androgens responsible for biochemical hyperandrogenism than controls (all
Measuring TT has its own limitations. TT values vary with the time of day at which the sample was taken, and many similar steroids present in the circulation tend to interfere with the assay. In addition, age- and gender-corrected reference values are lacking and there is no universally accepted testosterone calibration standard. 25
Based on all the evidence, the ESHRE PCOS guideline group 2018 22 concluded that, although there is inadequate evidence to recommend which androgens to measure, FT provides the best accuracy for detecting biochemical hyperandrogenism. The other hormones that could be tested are TT, DHEAS and androstenedione. DHEAS and androstenedione on their own do not provide additional information regarding hyperandrogenemia in PCOS.
Many women are on OCPs when they are seen in the clinic. In these women, hormone measurements cannot reliably assess hyperandrogenism, because the medication increases SHBG and reduces gonadotropin-dependent androgen production. Hence, OCPs should be discontinued for at least 3 months before these hormones are tested.
Hormonal assays
The assays that are used to measure androgens to diagnose PCOS include liquid chromatography–tandem mass spectrometry (LC-MS/MS), gas chromatographic mass spectrometry (GCMS), radioimmunoassay (RIA), chemiluminescence immunoassays (CLIA) and enzyme-linked immunosorbent assay (ELISA).
Based on the studies and data available, measuring FT provides the best ability to detect biochemical hyperandrogenism. Unfortunately, direct assays for FT are not entirely reliable. Direct assays such as RIA, ELISA and CLIA are technically simple, relatively inexpensive and can be automated, but they measure TT. These testosterone assays were designed for the higher levels found in males rather than the lower levels found in women, and hence their accuracy is limited at testosterone levels <300 ng/dL. 28 These assays also overestimate testosterone levels, are not standardised, are insufficiently precise and show poor sensitivity. RIA preceded by extraction and chromatography is widely used, with well-documented reference values among different populations, and has better sensitivity than CLIA and ELISA, but it is rather labour intensive, costly and time consuming.
Mass spectrometry (MS) after extraction and liquid (LC) or gas chromatography (GC) is highly accurate when validated properly, but it is expensive and standardisation is still lacking. 25 High-quality assays (LC-MS/MS and extraction/chromatography immunoassays) for total or unbound (free) testosterone provide the best possible accuracy. 22 Many of these assays are limited in that reference ranges vary widely between laboratories and are often based on an arbitrary percentile or on variance around the mean of the values observed in a population.
Clinical hyperandrogenism
Mild-to-moderate androgen excess is represented by hirsutism, alopecia and acne. Women with PCOS can present with one or more signs of hyperandrogenism.
Hirsutism
Hirsutism is defined as terminal hair in a male-like pattern in women. It can easily be overestimated if body and facial vellus hair is wrongly perceived as terminal hair. Terminal hair is distinctive in that it can grow beyond 5 mm. It is also important to bear in mind that some ethnic groups have denser vellus hair, so hirsutism can be overestimated. Confirming hirsutism can also be challenging because many women have already treated excess hair growth before presenting to the clinic.
One of the methods that is most widely used to assess hirsutism and grade its severity is the modified Ferriman–Gallwey (mFG) score. It includes assessment of nine body areas, and each area is visually scored from 0 to 4. A pictorial representation is then made.
The biggest controversy is defining a ‘cut-off’ value for the mFG score to diagnose hirsutism. The prevalence and severity of hirsutism differ markedly between ethnic groups. Although this is known, there unfortunately appears to be little difference in the cut-off values used to define excess terminal facial and body hair as abnormal (i.e. to define ‘hirsutism’). The mFG cut-off score can be based on percentile, with a score >6–8 consistent with the 95th percentile of unselected women.29–31 It can also be defined by a lower percentile (85th–90th percentile) or by cluster analysis, where the score is analysed in relation to other features of PCOS. Many studies have concluded that a lower mFG score 32 for black and white women compared with Asian women 33 represents true abnormality. Thus, generalising a cut-off at the 95th percentile is not appropriate, and hence the ESHRE PCOS guideline 2018 recommended a cut-off of ⩾4–6 on the mFG score. Overall, more than half of women with mFG scores of 3–5 34 and almost three quarters (>70–90%) of women with scores >5 29,35 have elevated androgens or PCOS.
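The scoring arithmetic behind the mFG score (nine body areas, each scored 0 to 4, summed and compared with a chosen cut-off) can be sketched as follows. The list of body areas is the commonly used one and is not given in the text, so it is an assumption here, as are all names in the code. The cut-off is left as an explicit argument because, as discussed above, no single value is agreed.

```python
from typing import Mapping

# Assumed list of the nine mFG body areas (not enumerated in the article).
MFG_AREAS = (
    "upper_lip", "chin", "chest", "upper_abdomen", "lower_abdomen",
    "upper_arm", "thigh", "upper_back", "lower_back",
)


def mfg_total(scores: Mapping[str, int]) -> int:
    """Sum the nine area scores after checking each is an integer from 0 to 4."""
    missing = [area for area in MFG_AREAS if area not in scores]
    if missing:
        raise ValueError(f"missing area scores: {missing}")
    for area in MFG_AREAS:
        if scores[area] not in (0, 1, 2, 3, 4):
            raise ValueError(f"score for {area} must be an integer from 0 to 4")
    return sum(scores[area] for area in MFG_AREAS)


def exceeds_cutoff(total: int, cutoff: int) -> bool:
    """True if the total meets or exceeds the caller's chosen cut-off."""
    return total >= cutoff


example = {area: 0 for area in MFG_AREAS}
example.update({"upper_lip": 2, "chin": 2, "lower_abdomen": 1})
total = mfg_total(example)
print(total)                     # 5
print(exceeds_cutoff(total, 4))  # True
print(exceeds_cutoff(total, 6))  # False
```

The same total of 5 is classified differently at the two ends of the ⩾4–6 range, which illustrates in miniature why the choice of cut-off is contested.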
Acne
Acne can be associated with biochemical hyperandrogenism. Unlike hirsutism, which offers good predictive value for hyperandrogenism, acne has been little studied in this respect, and most of the available studies are retrospective.36,37 There is also no accepted scoring system to clinically evaluate and measure acne. Overall, while acne in women might indicate androgen excess, the predictive value of acne alone for hyperandrogenism remains unclear.
Male pattern hair loss
Diffuse sagittal alopecia can be seen in women with PCOS. The Ludwig scale, 38 with a range from grade I to grade III, indicates increasing severity and can be used to visually assess scalp hair loss. On its own, its predictive value as a marker of androgen excess is unclear.
Ultrasound and polycystic ovarian morphology
Ovarian follicles undergo a continuous process of recruitment and apoptosis throughout reproductive life. This starts in foetal life, continues throughout childhood and adulthood and stops at menopause. Ovarian volume (OV) changes over time with changes in antral follicles and stromal development. Most women with clinical and endocrinological features of PCOS have ovaries that appear polycystic on ultrasound; hence this feature was added as a third inclusion criterion in 2003. 39
Various ultrasound features have been described in women with PCOS, including antral follicle count (AFC), follicular number per ovary (FNPO), OV, ovarian area (OA), ovarian blood flow and the ratio of stroma to total ovarian size.
Antral follicle count
AFC is considered to be a good measure to identify the severity of reproductive dysfunction in PCOS. Increased AFC was most significantly associated with increased androgens and LH:FSH ratio. 40 The polycystic ovaries in PCOS can be confused with other causes of multifollicular ovaries which could be both physiological (puberty) and pathological (hypothalamic anovulation, hyperprolactinaemia, central precocious puberty). As multifollicular ovaries are considered physiological in the early years of reproductive life (within 8 years of menarche), diagnosing PCOS based on ultrasound criteria in adolescents is not appropriate.
Earlier studies showed the greatest sensitivity for defining polycystic ovaries with a count of 10 or more follicles arranged peripherally around a dense core of stroma (Adams criteria), 41 while a definition of 12 or more follicles 42 offered greater specificity. Most authors had initially set this threshold at 10, 43,44 but some recommended 15. 45 Following a consensus opinion in 2003, the threshold was changed to 12 or more follicles measuring 2–9 mm in diameter.
With advances in ultrasound and higher transducer frequencies, a significant increase in FNPO was reported with transducer frequencies of >8 MHz. The previous FNPO threshold of 12 or more resulted in a significantly greater prevalence of polycystic ovarian morphology (PCOM) in the general population, especially in women below 30 years of age.46–49 Eleven studies including 2961 participants examined FNPO and found optimal sensitivity and specificity with a threshold of >19 follicles per ovary. The grid system method for counting antral follicles, devised by Lujan and colleagues in 2010, is the most reproducible technique. It showed a sensitivity of 85% and a specificity of 94% when 26 follicles per ovary was taken as the cut-off, although this cut-off may miss mild forms of PCOS. Others 50 suggested a lower threshold of 19 follicles.
Counting antral follicles and setting a threshold is controversial because results differ between populations and are affected by the counting method, observer variability in assessing follicle number and variable ultrasound technology.
When diagnosing PCOM on transvaginal scan, the ESHRE PCOS guideline group 2018 suggested a cut-off of FNPO of 20 or more in one or both ovaries, or OV >10 mL, excluding any dominant follicle, corpus luteum or cyst. This cut-off applies to transvaginal scans performed with a frequency band of >8 MHz. With ultrasound machines of older technology, a cut-off of FNPO of 12 or more or OV >10 mL should be used.
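The guideline cut-offs just described can be written as a small decision rule. This is an illustrative sketch under stated assumptions: the names are hypothetical, the volume criterion is applied to either ovary (the text states this explicitly only for FNPO), and the measurements passed in are presumed to have been taken without a dominant follicle, corpus luteum or cyst.

```python
def meets_pcom_cutoff(fnpo_left: int, fnpo_right: int,
                      ov_left_ml: float, ov_right_ml: float,
                      high_frequency_probe: bool) -> bool:
    """Apply the FNPO and OV cut-offs quoted in the text to one transvaginal scan.

    high_frequency_probe: True for a transducer with a frequency band above
    8 MHz; False for older technology.
    """
    # FNPO threshold depends on the ultrasound technology used.
    fnpo_cutoff = 20 if high_frequency_probe else 12
    follicle_criterion = max(fnpo_left, fnpo_right) >= fnpo_cutoff
    # OV threshold of more than 10 mL is the same for both technologies;
    # applying it to either ovary is an assumption of this sketch.
    volume_criterion = max(ov_left_ml, ov_right_ml) > 10.0
    return follicle_criterion or volume_criterion


# The same scan findings classified under the two technology assumptions.
print(meets_pcom_cutoff(15, 14, 8.0, 9.0, high_frequency_probe=True))   # False
print(meets_pcom_cutoff(15, 14, 8.0, 9.0, high_frequency_probe=False))  # True
```

The example shows how an identical follicle count crosses the older threshold but not the newer one, which is the source of the higher PCOM prevalence reported with the previous cut-off.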
A few studies show that 3D scanning with automatic volume calculation of antral follicles (e.g. VOCAL™ and SonoAVC™) offers better accuracy 51–53 as well as reduced interobserver variation in follicle counts 54,55 compared with manual 2D measurements. However, few studies have examined the reliability of 3D ultrasound for estimating the follicular population in polycystic ovaries.56–62 More studies are necessary to confirm its value before it can be recommended for routine practice.
Ovarian size
The size of the ovary changes through the reproductive life of a woman, with a slow decline during adulthood and rapid shrinkage at menopause. The ovary reaches its maximal size during adolescence. As only small changes in OV occur between the ages of 20 and 39 years, an age-specific OV cut-off is not warranted.
Histopathological studies show that stromal hypertrophy and increased follicle count, which together reflect PCOM, correlate with OV and OA. The Rotterdam criteria set a threshold of OV >10 mL for the diagnosis of PCOM. 63 Women with PCOS have larger ovaries than normal women matched for age and body weight. 63–65 There has been considerable interest in setting a lower cut-off for OV, with proposed values including 6.4, 66 6.7, 67 7.0 68,69 and 7.5 mL. 64 These different cut-offs could reflect variation in population characteristics.
The OV cut-off of >10 mL was based solely on the results of various studies in which the upper limit was defined as either the maximum value in controls or the 95th percentile of the control range. The currently accepted cut-off of >10 cm3 has a specificity of 98.2% but a much lower sensitivity of 45% in diagnosing PCOM. 42 It should be noted that OV measurement still holds its place when image resolution does not allow an accurate AFC.
Other ultrasound measurements
Ovarian stroma
Only a few studies have looked at ovarian stroma as a diagnostic tool for polycystic ovaries. Fulghesu and colleagues 70 proposed a cut-off of 0.32 for the ratio of stroma to total ovarian size and suggested that this cut-off value is associated with hyperandrogenaemia. However, stromal volume and ovarian size correlate well, and hence adding stromal volume to clinical practice does not provide much value.
Ovarian blood flow
To date, there are no homogeneous data confirming the value of measuring ovarian blood flow to diagnose PCOM. There are also no proposed cut-off values to differentiate PCOM from normal ovaries.
Increased serum AMH concentrations as a marker of PCOM
Anti-Mullerian hormone (AMH) is a glycoprotein produced by granulosa cells of the pre-antral and antral follicles of the ovary. As serum AMH reflects the number of antral follicles, significantly higher AMH levels are seen in women with PCOS than in normal women.71,72 There is also a direct correlation between AMH levels and the ultrasound parameters FNPO and OV.69,73–78
Dewailly and colleagues proposed a new adaptation of the diagnostic classification of PCOS based on FNPO and serum AMH levels. They concluded that, provided a universally available assay is used, elevated serum AMH in PCOS and PCOM could be a much more reproducible measure than FNPO, which shows interobserver variation as well as variation from unit to unit. 69 Their proposed classification builds on the previous classifications for the diagnosis of PCOS and uses FNPO > 19 and AMH > 35 pmol/L as surrogates when either oligo/anovulation or androgen excess is missing. 69 This classification is shown in Table 2.
Table 2. Diagnosis of PCOS incorporating serum AMH.
AMH, anti-Mullerian hormone; FNPO, follicular number per ovary; PCOM, polycystic ovarian morphology; PCOS, polycystic ovary syndrome.
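One possible reading of the classification described above is sketched below. It is reconstructed from the prose only and may not match the published table in every detail: the logic assumes that both clinical features together are sufficient, and that when only one is present either FNPO > 19 or AMH > 35 pmol/L can stand in for the missing one. The function and parameter names are hypothetical, and exclusion of other causes is presumed.

```python
from typing import Optional


def pcos_with_amh_surrogate(oligo_anovulation: bool,
                            hyperandrogenism: bool,
                            fnpo: Optional[int] = None,
                            amh_pmol_l: Optional[float] = None) -> bool:
    """Illustrative reading of a classification that uses FNPO or AMH as a surrogate."""
    # Both clinical features present: no surrogate marker is needed.
    if oligo_anovulation and hyperandrogenism:
        return True
    # Exactly one clinical feature present: a surrogate marker is required.
    if oligo_anovulation or hyperandrogenism:
        fnpo_positive = fnpo is not None and fnpo > 19
        amh_positive = amh_pmol_l is not None and amh_pmol_l > 35
        return fnpo_positive or amh_positive
    # Neither clinical feature present: a surrogate alone is not sufficient.
    return False


print(pcos_with_amh_surrogate(True, False, amh_pmol_l=42.0))  # True
print(pcos_with_amh_surrogate(False, False, fnpo=25))         # False
```

Any such rule inherits the assay dependence of the AMH threshold, so the 35 pmol/L value should not be assumed to transfer between assays.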
Overall, 26 different studies examined the area under the receiver operating characteristic curve (AUC-ROC) of AMH for the diagnosis of PCOS. Of these, 21 were in adults, with AUC-ROC values of 0.66–0.994 and thresholds ranging from 10 to 57 pmol/L, 80 and 5 were in adolescents, with AUC-ROC values of 0.5–0.88 and thresholds ranging from 25 to 44 pmol/L. 79 Thus, although AMH is significantly higher in women who have PCOS, reported levels vary widely. This could be due to variation in the AMH assays used and in the populations and phenotypes of the women studied.
Many studies that examined the correlation between AMH levels and the diagnosis of PCOS used Diagnostic Systems Laboratories (DSL) or Immunotech (IOT) assays, which are no longer available. Results from the more recently used Gen II kit also need cautious interpretation, and very few data are available on the newest automated assays. 80 It is also worth noting that the International Federation of Clinical Chemistry does not provide a standard for assay methods. In view of all these caveats, serum AMH is not yet accepted as a surrogate for diagnosing PCOM. AMH could become a surrogate marker for PCOM in the future, provided further research validates it in large populations of different backgrounds.
Conclusion
PCOS remains a controversial topic because of its varied etiology and undetermined phenotypic spectrum. The existing diagnostic criteria are those suggested by the NIH in 1990, the Rotterdam criteria of 2003 and the AES criteria of 2006. The expansion of the diagnostic criteria in 2003 was aimed at capturing the different phenotypes that exist. In 2006, the AES task force accepted the original 1990 NIH criteria with modifications that took the 2003 Rotterdam criteria into account. Despite being in widespread global use, each diagnostic criterion remains subject to unresolved controversy, as much of the evidence is based on consensus opinion rather than robust data. Although there is a clear cut-off for ‘irregular cycles’ in adults, defining ‘irregular cycles’ in adolescents remains highly controversial. Further longitudinal studies are needed to examine the natural history of PCOS and its early predictors in adolescents. Assessing hyperandrogenism clinically is highly subjective, and further studies are needed to determine cut-off values for the mFG scoring system. There is insufficient evidence regarding the best method for measuring androgens, and the methods in use are insufficiently precise. FNPO, which forms one of the diagnostic criteria, has been well researched: eleven studies including 2961 participants concluded that optimal sensitivity and specificity were achieved with a cut-off of >19 follicles. For OV, 12 studies with 2096 participants did not yield a clear optimal cut-off, with values in both the 5–8 cm3 and 9–10 cm3 ranges being proposed. There is inadequate evidence for the use of other ultrasound parameters to diagnose PCOS. The use of AMH as a substitute marker in the diagnosis of PCOS is hindered by the need for improved assay standardisation, and the current evidence does not adequately support its role. In conclusion, large-scale scientific and clinical research is needed in this field.
