Musicians must develop strong mental and psychomotor skills (Ericsson, Krampe, & Tesch-Römer, 1993; Palmer, 1997) in order to master their musical instruments. These specific skills differ by musical instrument (Clarke, Parncutt, Raekallio, & Sloboda, 1997). Wind instruments, for example, require skills such as air support and lip tension, while string instruments require skills such as bowing, string crossings and shifting (changing positions) (Wurtz, Mueri, & Wiesendanger, 2009). The performance of these skills in isolation can already be very difficult, and their combined performance requires coordination, which may (initially) lead to wrong notes (Alexander & Henry, 2012), unintended tempo changes and pauses. Zdzinski and Barnes (2002) identify five factors as valid and reliable measures for string instrument performance: interpretation/musical effect, articulation/tone, intonation, rhythm/tempo, and vibrato.
String instruments differ both technically and pedagogically from other instrument groups. Some important technical differences include (a) the coordination between the left and right hands, (b) the production of the melody based on interchangeable finger patterns instead of fixed finger buttons, keys, or frets, and (c) the production of sound by moving the bow in two directions on the strings. Whereas methods for woodwinds and brass usually start with long notes to support the production of a stable intonation through airflow, string instrument methods start with shorter notes in order to develop the techniques by which the melodies are bowed. Likewise, most wind instrumentalists start by learning note-specific finger buttons or keys on their instrument, whereas string instrumentalists focus on finger patterns and positions to produce combinations of notes. The position classification of Dotzauer and Klingenberg (1934) is commonly used for the cello. The starting point is the open string combination of A, D, G and C. The first position starts a whole tone higher on B, E, A and D, whereas the second position starts only a semitone higher than the first position on C, F, B
Sight-reading, an indispensable skill for (classically trained) musicians, is defined as “the ability to play music from a printed score or part for the first time without benefit of practice” (Wolf, 1976, p. 143). Both reading and psychomotor skills play an important role in this complicated process. As our ultimate goal is to determine how cellists can improve their sight-reading performance, we will mainly focus on the pedagogical perspective with teaching approaches and related factors that promote sight-reading.
Theories about sight-reading have a long history. Karpinski (2000) suggests that historical views on sight-reading focus on developing a mental picture of printed music. The romantic composer Robert Schumann (1848/1967) taught his piano students that playing music only by practicing psychomotor skills is insufficient. He advised them to make a mental representation of the melody and its underlying harmonies. The ability to think in music and musical images was also suggested by Seashore (1919) but should be considered “essentially anecdotal” (Brodsky, Kessler, Rubinstein, Ginsborg, & Henik, 2008, p. 428). It is more likely that musicians, particularly vocalists, benefit from an internal auditory representation of pitch and pitch relationships (Fine, Berry, & Rosner, 2006). According to Woody (2012), the key element in playing music is linking such representations to motor production. He refers to these representations as “goal images” that can originate from sight-reading (built from notation) or from “playing by ear” (stored mental image in memory). These goal images are taught during ear training lessons in which the use of elements such as pitch identification, chord identification, key identification, interval identification and practice of rhythm are encouraged. In this study, we will refer to these elements as “tonal approaches”. To our knowledge, specific elements of string instrument didactics, such as position identification, string identification, shifting and extensions, have never been considered in terms of conditional skills for sight-reading ability on string instruments. In this study, we will refer to these elements as “positional approaches”.
Musical sight-reading is inseparable from conventional training and from the careers of classically-educated musicians. A smooth interpretation of musical notation is required of both beginners and professionals in order for them to quickly, effectively and accurately rehearse and perform compositions. As it is almost impossible to thoroughly study large amounts of music in a short period of time, especially in an ensemble context, musicians regularly have to sight-read. There has been an ongoing debate on whether sight-reading can be taught or is a stable characteristic. Sloboda (1978) concluded that sight-reading indeed consists of several cognitive processes that can be learned. In addition, findings from a meta-analysis of Mishra (2014) support the idea of sight-reading as “a teachable activity rather than a stable characteristic and that sight-reading is a skill that improves with the musicality of the performer” (p. 461). Kopiez and Lee (2008) found a combination of teachable and non-teachable factors, including psychomotor speed, early acquired expertise, mental speed and auditory imagery, to be the best predictors of sight-reading ability. Kopiez, Galley, and Lee (2006) classify predictors of sight-reading ability into three groups: general cognitive skills (e.g., short-term memory and working memory), elementary cognitive skills (e.g., reaction time and speed of information processing), and practice-related skills (e.g., auditory imagery and expertise). Mishra (2014) concludes that improvisation, instrumental technique, age and ear training are factors that moderately correlate with sight-reading ability. Gromko (2004) and Wurtz et al. (2009) stress the importance of music reading comprehension, musical experience and perception, anticipation, short-term memory, and audiation as essential components of sight-reading. From a pedagogical perspective, only the teachable factors related to sight-reading are relevant for further consideration. 
To that end, Zhukov (2014) selects three approaches that are most promising for piano students: experience in accompanying, rhythm training and knowledge of musical style. Wristen (2005) stresses the importance of developing pedagogical methods by understanding cognitive processes related to sight-reading, whereas Thompson and Lehmann (2004) emphasize the importance of understanding the musical structure. Finally, Gudmundsdottir (2010) identifies rhythmic accuracy as a way to improve sight-reading.
Previous research on sight-reading has mainly adopted a tonal perspective, where intervals and harmonic functions form basic premises. Alexander and Henry (2012) also observe potential in the use of tonal approaches during sight-reading. Their study constitutes one of the few examples of research in sight-reading that focuses exclusively on string instruments. In their study, they use a modified version of the Vocal Sight-Reading Inventory (VSRI; Henry, 1999, 2001), which was developed for measuring the sight-reading level of vocalists and comprised “28 tonal patterns … defined as ascending and descending conjunct motion, skips and leaps within chordal elements, cadential patterns, modulatory patterns, and chromatic patterns” (Alexander & Henry, 2012, p. 203). According to Alexander and Henry (2012), the VSRI is a valid tool for assessing pitch sight-reading ability for string instrumentalists. However, the VSRI does not consider the difficulty of simultaneous tasks, such as key (number of accidentals), string crossings, extensions of finger patterns and shifting.
A number of arguments support the notion that research focusing on a mere tonal approach to sight-reading on string instruments should be extended with research into position knowledge and the use of approaches while performing. Musicians might, for example, use mnemonics to remember accidentals and positions on their instrument. Apart from the knowledge and use of approaches, inner hearing might also play a role by correcting intonation after the tone has been produced. In contrast to that of vocalists, string instrumentalists’ production of tone requires several psychomotor movements that are defined by the technical features of the instrument, such as extensions and shifting distances. In addition, producing the right pitch on a string instrument relies heavily on shifting into an appropriate position and/or crossing strings at the right moment. It can be conjectured that string instrumentalists must consider theoretical knowledge, such as key, while performing a melody, whereas vocalists just produce tones by making adjustments in their larynx. We expect that both poor decisions and execution in shifting and a lack of theoretical knowledge about extensions and accidentals lead to pitch errors in sight-reading. Therefore, it is expected that having a more extensive mental position knowledge network will be associated with fewer pitch errors. A well-developed position knowledge network might also speed up the execution of shifting and recognition of extensions, thereby causing fewer fluency errors.
Research questions
In the context of sight-reading by cellists, the purpose of this study is to answer the following questions:
RQ1: What effect does key complexity have on sight-reading performance?
RQ2: Does better position knowledge lead to better sight-reading?
RQ3: Does the use of positional approaches yield better sight-reading than the use of tonal approaches?
RQ4: Are pitch and fluency errors in sight-reading predicted by the same combination of factors?
Method
Participants and design
The participants were 79 amateur cello students (52 female), aged 7 to 70 years (
Materials and measurement tools
Three versions of all measurement tools were developed, each differing only in the order of their questions. The measurement tools were randomly assigned to participants to avoid order effects.
Transposed VSRI
Four sight-reading melodies selected from Henry’s (1999, 2001) VSRI were adapted to measure participants’ sight-reading abilities based on pitch skills and fluency errors (see Appendix C). The melodies were selected based on the resulting positional complexity for cellists after transposing the melodies into other keys, which resulted in the Transposed VSRI (T-VSRI) (see Figure 1). The four original melodies consisted of keys with no more than two accidentals and were considered by the first author to be of a low complexity level. The moderate complexity level contained the same melody; however, the music was transposed to keys with three or four accidentals, while transpositions to keys containing five or six accidentals resulted in a high complexity level. The melody of the sight-reading tasks was unknown to participants, and because of the solo performance, any written harmonic context was absent.

Figure 1. Sample scores in low, moderate and high complexity levels of the T-VSRI, transposed from original melodies from the VSRI by Henry (1999, 2001).
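The three complexity levels reduce to a threshold rule on the number of accidentals in the key signature. The following is a minimal Python sketch of that rule. The names `complexity_level` and `MAJOR_KEY_ACCIDENTALS` are hypothetical and introduced only for illustration, and the lookup table lists standard major key signatures, not the specific keys used in the T-VSRI, which are not stated in the text.

```python
# Standard number of sharps or flats in some major key signatures.
# Illustration only: the keys actually used in the T-VSRI are not listed in the text.
MAJOR_KEY_ACCIDENTALS = {
    "C": 0, "G": 1, "D": 2, "A": 3, "E": 4, "B": 5, "F#": 6,
    "F": 1, "Bb": 2, "Eb": 3, "Ab": 4, "Db": 5, "Gb": 6,
}


def complexity_level(n_accidentals: int) -> str:
    """Map the number of accidentals in a key signature to a complexity level.

    Thresholds follow the description in the text:
    up to two accidentals = low, three or four = moderate, five or six = high.
    """
    if not 0 <= n_accidentals <= 6:
        raise ValueError("expected between 0 and 6 accidentals")
    if n_accidentals <= 2:
        return "low"
    if n_accidentals <= 4:
        return "moderate"
    return "high"


# Example: classify three hypothetical target keys for a transposition.
for key in ("D", "Eb", "B"):
    print(key, complexity_level(MAJOR_KEY_ACCIDENTALS[key]))
```

Whether keys without any accidentals were used at the low level is an assumption of this sketch; the text only specifies "no more than two accidentals".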
Position knowledge test
A position knowledge test (PKT) was developed by the first author and consisted of 40 multiple choice questions with four possible answers (see Figure 2). This test was based on Dotzauer’s and Klingenberg’s (1934) generally accepted didactic classification of positions on the cello in continental Europe. Rasch analyses (see Table 1) showed a strong item reliability (0.89) and acceptable outfit measures between 0.51 and 1.99 (Bond & Fox, 2007). Items were labeled per question type (easy fingerings, difficult fingerings and position/extension).

Figure 2. Example of an item from the Position Knowledge Test with four possible answers.
Table 1. Rasch item measures with labels, infit and outfit values and PTMEA correlations.
Furthermore, the inter-item reliability appeared to be very strong (α = .94). The PKT differentiated very well at all levels, except for the 14 highest scoring participants (Wolfs, 2016).
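Assuming the reported α is Cronbach's alpha (the text does not name the coefficient), it can be computed from a participants-by-items score matrix as α = k/(k − 1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below uses only the Python standard library. The function name `cronbach_alpha` and the toy data are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Compute Cronbach's alpha.

    scores: list of rows, one row per participant, each row a list of
    item scores (e.g., 1 = correct, 0 = incorrect), all of equal length.
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two participants")
    if any(len(row) != n_items for row in scores):
        raise ValueError("all rows must have the same number of items")

    # Sample variance of each item across participants.
    item_variances = [
        variance([row[i] for row in scores]) for i in range(n_items)
    ]
    # Sample variance of the participants' total scores.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Toy data (hypothetical): four participants, three dichotomous items.
toy_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(toy_scores), 3))
```

For the actual PKT, each row would hold one participant's 40 item scores.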
Survey of sight-reading approaches
A survey of sight-reading approaches (SSRA) was developed by the first author. Participants’ perceptions of their use of tonal (see Appendix A) and positional (see Appendix B) approaches while sight-reading were measured with this questionnaire. The survey contained 24 statements that were rated on a 5-point Likert scale and equally divided over the approach categories. An example of a positional statement is
Sight-reading errors
For reasons of accuracy, errors were counted rather than correct notes and fluent transitions between tones. Wrong notes in performing the melodies of the T-VSRI were defined as produced tones that tonally could not be linked to the corresponding note in the sheet music in terms of intonation. Pauses not exceeding the length of one quarter-note rest in the chosen tempo were labeled “short”; otherwise, they were labeled “long”. The audio data from the T-VSRI were scored by the first author (a certified cello teacher) for pitch errors (wrong notes) and fluency errors (short and long pauses). Twenty percent of the audio files were also scored by another certified cello teacher, resulting in a value of Cohen’s kappa of .933 (
Procedure
Participants and their parents were asked for their approval to participate through a letter. As part of the research was conducted during a regular cello lesson, efforts were made to minimize the disruption of the structure of the lesson: the teacher led the lesson and implemented the sight-reading tasks after giving feedback on the music that the student had prepared and rehearsed. Participants were asked to fill out both the SSRA and the PKT at home and return them to their teacher within two weeks. The tests were then anonymized. The SSRA contained instructions with examples on how to fill out a Likert scale. The PKT contained comprehensive instructions on the subject of labeling positions, understanding fingerings and recognizing extensions. The teacher checked the returned tests for missing data. Before participants performed the T-VSRI melodies during the cello lesson, we gave participants brief oral instructions to (a) set their own tempo, (b) try to maintain that tempo throughout the melody, and (c) prepare themselves for a maximum of 30 seconds before starting to perform each melody. We digitally recorded every performance to score it at a later time. While the participants played the melodies, we observed their position use and gave a passing mark when they made at least five shifts. We collected background data, such as age and gender, at the beginning of each lesson.
Data analysis
One-way repeated-measures ANOVAs were conducted (RQ1) to test the effect of key complexity on sight-reading performance. Three paired
Results
RQ1: What effect does key complexity have on sight-reading performance?
Analyses of variance were conducted to test the effect of the complexity levels on both pitch errors and fluency errors (see Table 2). A large main effect of complexity on pitch errors was found,
Table 2. Average number of pitch errors and fluency errors per complexity level (
Post hoc analyses using Bonferroni correction indicated significantly more pitch and fluency errors each time the level of complexity increased (
Table 3. Compilation of the results of the post hoc comparisons of levels of complexity with an average difference in pitch and fluency errors (
RQ2: Does better position knowledge lead to better sight-reading?
Linear regressions were calculated to predict the number of pitch and fluency errors based on the amount of position knowledge. A significant regression equation on pitch was found,
A MANOVA was conducted to test the difference in the number of pitch errors and fluency errors based on the amount of position knowledge between participants with a low (0–20), a moderate (21–30) and a high (31–40) position knowledge score. The results suggested a significant difference between participants with a low, moderate and high amount of position knowledge, Wilks’s Λ = .298,
Table 4. Average number of pitch and fluency errors per class of amount of position knowledge.
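The grouping used for this comparison can be sketched as follows: each participant's PKT score (0 to 40) is assigned to the low (0–20), moderate (21–30) or high (31–40) class, and error counts are then averaged per class. The function names and the example records are hypothetical and serve only to illustrate the grouping; they are not the study's data or analysis code.

```python
from collections import defaultdict
from statistics import mean


def knowledge_class(pkt_score: int) -> str:
    """Assign a PKT score to the class boundaries given in the text."""
    if not 0 <= pkt_score <= 40:
        raise ValueError("PKT scores range from 0 to 40")
    if pkt_score <= 20:
        return "low"
    if pkt_score <= 30:
        return "moderate"
    return "high"


def mean_errors_per_class(records):
    """Average an error count per position knowledge class.

    records: iterable of (pkt_score, n_errors) pairs, one per participant.
    Returns a dict mapping class label to the mean number of errors.
    """
    groups = defaultdict(list)
    for pkt_score, n_errors in records:
        groups[knowledge_class(pkt_score)].append(n_errors)
    return {label: mean(errors) for label, errors in groups.items()}


# Hypothetical records: (PKT score, number of pitch errors).
example = [(10, 8), (20, 6), (25, 4), (35, 1), (40, 3)]
print(mean_errors_per_class(example))
```

The same helper can be applied separately to pitch errors and to fluency errors; the multivariate test itself (MANOVA) is not reproduced here.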
RQ3: Does the use of positional approaches yield better sight-reading than the use of tonal approaches?
Participants’ use of tonal and positional approaches was analyzed using Pearson product-moment correlation coefficients (see Table 5). A Fisher’s
Table 5. Correlations between tonal and positional approaches, position knowledge, pitch and fluency errors.
RQ4: Are pitch and fluency errors in sight-reading predicted by the same combination of factors?
Multiple regression analyses were performed to predict the number of pitch errors as well as short and long pauses. A significant regression equation was found for pitch errors,
Table 6. Predictors of pitch errors, short and long pauses in multiple regression models (the proportion of explained variance per model and the standardized regression coefficients per variable).
Another multiple linear regression was calculated to predict the number of short pauses. A significant regression equation was found,
The roles of the variables of “age” and, to some extent, “experience” deserve a closer look. As shown in Table 6, the explained variance of these predictor variables declines sharply in Step 4 when technical level enters the model. This decline in the explained variance applies even more so in the model of pitch errors. Adding the use of tonal and positional approaches to the model (Step 2) increases the percentage of explained variance for all dependent variables; however, the use of tonal approaches significantly contributes only to the models of fluency errors, whereas the use of positional approaches significantly contributes only to the model of pitch errors. Position knowledge in particular increases the explained variance of pitch errors (Step 3) but seems to replace the explained variance of the use of positional approaches. Finally, adding technical level increases only the explained variance of the pitch error model, somewhat at the expense of positional knowledge.
Discussion and conclusion
On the subject of key complexity in sight-reading performance by cellists (RQ1), we observed a very strong effect that resulted in a significant difference in the average number of pitch and fluency errors between keys with one or two accidentals, three or four accidentals and five or six accidentals. These results are in line with findings from previous research, which showed that short-term memory (Gromko, 2004) and knowledge of music theory (Gudmundsdottir, 2010) should be considered important elements of sight-reading. The degree of theoretical music knowledge (provided in cello methods or by the teacher) and the degree of experience with accidentals and matching positions might have caused these differences. After all, the more accidentals that appear in a melody, the more uncommon (extended) position shifts are required. A third element that could have influenced the results was the nature of the melodies themselves: without the support of a harmonic context, the unknown melody could be hard for participants to predict. This could also impede participants’ judgment on their sight-reading task through aural reflection on pitch and intonation.
A higher degree of position knowledge was indeed associated with better sight-reading results (RQ2). It proved to be a very strong predictor of the number of pitch errors. This finding is consistent with previous research that showed that reading comprehension and studying music theory contribute to better sight-reading (Gudmundsdottir, 2010; Kopiez & Lee, 2008; Mishra, 2014; Sloboda, 1978; Wristen, 2005). Position knowledge, however, was only a weak predictor of fluency errors. It therefore appears that gaps in the position network of participants did not cause the majority of fluency errors. Perhaps poor goal images prevent a correct link to psychomotor production.
RQ3 investigated whether the use of positional approaches yields better sight-reading results than the use of tonal approaches. No significant difference was found between the correlations of positional and tonal approaches with sight-reading errors. Our results indicate that both approaches prevented sight-reading errors equally and did not appear to be mutually exclusive. All participants indicated that they used both approaches while sight-reading.
Most findings in this study were in line with expectations, such as those based on the literature on sight-reading and pedagogical-didactic insights. Some results, however, warrant more detailed consideration. A first finding concerns the different roles of the predictor variables in the combination of factors that predict pitch and fluency errors (RQ4). There are indications that pitch and fluency are partially explained by different factors. The degree of position knowledge and technical level were the best predictors of the number of pitch errors. While use of positional approaches was an explanatory variable at first, it disappeared from the regression model in favor of positional knowledge. Although positional knowledge and approaches are not operationalized from the same construct, their relation is obvious: it is likely that participants who have greater positional knowledge are also more inclined to use positional approaches. Experience was found to be a moderate predictor of pitch errors, but this variable disappeared from the regression model in favor of position knowledge and technical level. The explanation for this shift might be that participants who had achieved a higher technical level over years of study also had more general experience. However, this shift did not occur in the models that predicted fluency errors: experience and the use of tonal approaches proved to be the best predictors of fluency errors.
These findings raise a few questions that deserve a closer look. The first question is why participants with a decent technical level and a good understanding of the sight-reading score still struggle with fluency errors. The combined results suggest the following assumption: to prevent fluency errors during sight-reading, technical and theoretical insights in the form of position knowledge and/or the use of positional approaches alone are insufficient. This assumption is supported by the low number of pitch errors in this group compared with the still high number of fluency errors on all complexity levels (see Table 4).
The number of pitch errors increased rapidly when participants were confronted with more accidentals, as shown by the difference scores. This increase does not apply to fluency errors. Since position knowledge is only weakly correlated with fluency errors, we suspect that the faltering of participants cannot be mainly attributed to gaps in positional knowledge. Fluency errors, however, could also have been the result of poor goal images, resulting in inadequate psychomotor execution.
The use of positional approaches as a predictor of pitch errors disappeared from the regression model in favor of position knowledge and technical level, as indicated by the results. A second question is therefore whether tonal knowledge and skill would also eliminate the predictive power of experience and/or the use of tonal approaches regarding fluency errors. Tonal knowledge in this context can be defined as having insights into harmony and intervals, whereas tonal skills can refer to the aural imagination of the staff and the ability to correctly hear or sing a written melody internally while sight-reading. From this perspective, participants should possess an aural imagination of the staff. Opinions vary on whether human beings are able to mentally see or hear tonal functions in printed music. Drai-Zerbib, Baccino, and Bigand (2011) support the assumption of “audiation”, whereas Brodsky et al. (2008) deny its existence. The findings in this study, however, appear to confirm the role of a tonal consciousness. This assumption is consistent with Dowling’s (2014) description of a “tonal framework” in which he makes a comparison between the spoken language and a tune being played. Educated human beings interpret the content of a sentence as an integrated whole rather than as a series of single words. From this perspective, a melody is not merely a series of single notes but a rhythmic-melodic chain of tonal functions. The observed effect of tonal approaches on fluency errors is consistent with this perspective, but since this study did not take tonal knowledge and tonal skill into consideration, future research might provide us with answers. Meta-analyses from Mishra (2014, 2016) do, however, confirm the importance of aural training and solfège – both closely linked to tonal knowledge and skill – for improvement in rhythmical and melodic sight-reading.
A final question is why the use of tonal approaches seems to play no meaningful role in preventing pitch errors. If producing a melody on a string instrument is seen as first being able to technically produce the right pitches followed by fluently chaining those pitches while taking into account the rhythm, the current results of this study suggest that for the technical part of producing music, only positional knowledge and technical skills are required. Our expectation is that in order to play a melody in a “musical” way, tonal knowledge and tonal approaches might also be required. Future research might reveal whether Schumann (1810–1856), one of Europe’s most famous romantic composers, was right in claiming that technical skill is not sufficient to produce music.
Schumann (n.d.) claimed the following: It is not only necessary that you should be able to play your pieces on the instrument, but you should also be able to hum the air without the piano. Strengthen your imagination so, that you may not only retain the melody of a composition, but even the harmony which belongs to it. (p. 3)
The findings in this study contribute to existing cognitive theories on musical performance and seek to promote the use of different approaches to improve sight-reading by string instrument players in music education. What we do know is that the right pitch can be produced with a high technical level and extensive positional knowledge. Tonal knowledge, skill and approaches might, however, form the key to fluent sight-reading performance.
Future research on sight-reading has to be conducted to determine the role – if any – of tonal knowledge and skill as well as the use of tonal approaches regarding sight-reading. Which aspects other than experience and the use of tonal approaches are associated with fluency errors or prevent them, and is there a difference in the causes of short and long pauses? Furthermore, the cause of fluency errors should be looked at in more detail. In addition, can these findings in the future lead to a theory on musical performance that is broader than the string instrument family alone?
This study had several limitations that possibly affected the results. First, participants had to perform melodies without any given fingerings, which is unusual in an educational setting. Second, sight-reading was limited to pitch and fluency in this study, but sight-reading includes many more elements, such as rhythm, tempo, timing, phrasing, intonation, dynamics and musical styles.
From an educational point of view, we advise teachers to teach students music theory, such as reading notes, naming positions, and recognizing extensions and accidentals in order to develop a comprehensive and flexible mental “topographic map of positions” on the cello. The stimulation of the development of tonal knowledge and tonal approaches through ear training is also recommended. This study indicates that proper sight-reading is highly dependent on knowledge (stored in our minds) and motor skills (executed with our hands). Perhaps tonal competencies can be considered the missing link between playing notes and playing music straight from the heart.
