PSYCHOLOGY AT A GLANCE by Susan R Lancaster: 06/15/13

Saturday, June 15, 2013

UNDERSTANDING RELIABILITY & VALIDITY OF A TEST

There are many times in today’s society when psychological tests may be administered in order to measure certain mental and/or behavioral characteristics. However, before administering a psychological test, it is essential to choose the most appropriate testing instrument for collecting data. This is because when inappropriate test methods are used, the final results may show inadequate levels of reliability and/or validity.

Since this can occur, my ultimate goal for this paper will be to provide a better understanding of test reliability and validity by addressing certain aspects that are directly associated with these concepts. This will include first discussing what the reliability of a test is, why it is important, and how the two main types, test-retest reliability and internal consistency reliability, can be measured. I will then address what the validity of a test is, why it is important, and how the four main forms, which include face validity, content validity, criterion-related validity and construct validity, can also be measured.

The next section will identify certain things that psychologists need to do to ensure that selected test methods will measure adequate levels of reliability and validity. Three of these include gathering personal information about each participant, choosing well-established test methods, and addressing any ethical, legal, individual and socio-cultural issues that may arise. Some specific ethical considerations that will also be addressed include confidentiality, cross-cultural sensitivity, informed consent, protection from harm, and test administration.

Furthermore, the final thing that I will discuss in this paper is why all four forms of validity must be considered and applied when using test methods in different types of settings. This is because the validity of a measure can change due to various factors within each setting. One example of this could be if a psychologist decides to use an intelligence test that is designed for adults with elementary school students. In this case, the level of overall validity may be lacking because the test items and norms were developed for adults rather than for young children. Therefore, the test method in question would not be appropriate to use with both age groups.

Reliability of a Test

Reliability can be defined as the degree to which scores from a specific test are consistent and free from errors of measurement. Some possible factors that can cause errors of measurement are the test environment, poorly worded questions, nervous test takers, or unclear instructions from the examiner. There are also two types of reliability that can be measured. The first type is test-retest reliability, which is based on how consistent participant responses are over time, and it can be measured by administering the same test at different times. The second type is internal consistency, which is based on how consistently the items within a single test measure the same construct; it can be estimated by administering the test once and then calculating the average of all correlations among items (Zechmeister, Zechmeister & Shaughnessy. 2001).
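As a rough illustration of the two estimates described above, the following Python sketch computes a test-retest coefficient as the correlation between two administrations of the same test, and an internal consistency estimate as the average of all correlations among items. The scores, the function names, and the use of the Pearson correlation are assumptions made for illustration only; they are not taken from the cited text.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def retest_reliability(time1, time2):
    """Correlate the same participants' scores from two administrations."""
    return pearson_r(time1, time2)


def average_inter_item_correlation(items):
    """Average the correlations among all pairs of items.

    `items` holds one list per test item; each list contains every
    participant's response to that item, in the same participant order.
    """
    pairs = list(combinations(items, 2))
    return sum(pearson_r(a, b) for a, b in pairs) / len(pairs)


# Hypothetical total scores for five participants, tested twice.
time1 = [10, 12, 15, 18, 20]
time2 = [11, 12, 16, 17, 21]
print(round(retest_reliability(time1, time2), 2))

# Hypothetical responses of the same five participants to three items.
items = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
print(round(average_inter_item_correlation(items), 2))
```

In both cases, a coefficient closer to 1.0 suggests more consistent scores, while a value near 0 suggests that measurement error is playing a large role.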

Determining test-retest reliability and internal consistency reliability is important because it can confirm whether a specific test is consistent over time and measures the construct that it was initially designed to measure. This data is also essential because it allows psychologists the opportunity to confirm which testing methods may be most reliable and appropriate to use with participants and/or specified needs. If both types of reliability cannot be measured then it is also impossible to confirm validity and, therefore, the test method in question should not be used.

Validity of a Test

Validity can be defined as the degree to which a specific test measures what it is intended to measure. There are also different forms of validity that must be addressed when completing a validation process. The first form is called face validity and this is used to determine if a test appears to measure the criteria within a specific domain. However, in many areas of psychology, this may not be considered a true form of validity because it is not certain that the appearance of test items is an accurate representation of the intended domain (Neukrug & Fawcett. 2010). Therefore, this is considered a basic observational type of validation that is used to measure the validity of a test at face value only. One example of how face validity can be measured is if a psychologist designs a test to assess mathematical skill. He or she could then request feedback from laypeople to determine if they agree that the test may actually measure mathematical skill based on its appearance.

Providing evidence of face validity is also important because no formal testing instrument can be accepted in the field or used in future research and design without it. However, there are instances when informal assessment tools that lack face validity may still be used. One example of this is if a psychologist designs an online survey that actually initiates sales of self-help products versus its stated purpose of simply collecting consumer data.

The second and simplest form of validity is known as content validity and this is used to measure “how adequately a test samples behavior representative of the universe of behavior that the test was designed to sample” (Cohen, Swerdlik. 2010. p. 176). This is similar to face validity but it confirms whether a test actually measures criterion within a specific domain instead of just assuming that it does. One example of this is if a psychologist has mathematical experts confirm that a test used to assess mathematical skill actually does measure that domain.

Confirming content validity is also important because it gives psychologists the opportunity to determine which test instruments measure an adequate level of content validity versus those that do not. Furthermore, if a test instrument does not have an adequate level of content validity then a more valid method should be used. This is because it can ensure that the collected data may have a higher level of overall validity, accuracy and truthfulness.

A third form of validity is called criterion-related validity and this can determine if a test method produces similar results when compared to valid established instruments that measure the same variable. One example of criterion-related validity could be if employee selection tests are validated against measures of a criterion like job performance. There are also two types of criterion validity, which are predictive and concurrent validity. Predictive validity is based on how well test scores predict an individual's performance on a future measure, and concurrent validity is based on how test methods compare to similar instruments that measure the same criterion at the same time.

Confirming evidence of criterion-related validity is also important because it gives psychologists the opportunity to predict measures with future participants and determine which instruments measure an adequate level of criterion validity when compared to valid established tests. If a test instrument does not have an adequate level of criterion validity then a more valid method should be used. This is because it can ensure that collected data will closely reflect the results that are measured when using valid, more established test methods.

A fourth form is known as construct validity and, according to Bordens and Abbott (2008), this type of “validity applies when a test is designed to measure a "construct" or variable "constructed" to describe or explain behavior on the basis of theory” (p. 130). Establishing construct validity can be a tedious process because it requires a gradual accumulation of evidence which supports that scores relate to observable behaviors in the way predicted by an underlying theory. One example of how a psychologist can measure construct validity is by testing whether participants who have higher intelligence scores also achieve higher grades in school. There are also two types of construct validity which are known as convergent and discriminant. Convergent validity is supported when final results are similar to those of a different test that measures the same construct, and discriminant validity is supported if the selected test does not measure constructs that it was not intended to measure (Bordens & Abbott. 2008).
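To make the distinction between convergent and discriminant evidence more concrete, the sketch below correlates a hypothetical new test with an established test of the same construct and with a measure of an unrelated construct. All scores and names are invented for illustration, and using a simple Pearson correlation is an assumption rather than a procedure described by Bordens and Abbott.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for the same five participants on three measures.
new_test = [10, 12, 15, 18, 20]          # the test being validated
established_test = [11, 13, 14, 19, 21]  # same construct, accepted measure
unrelated_measure = [5, 3, 6, 3, 5]      # a construct the test should not tap

convergent_r = pearson_r(new_test, established_test)
discriminant_r = pearson_r(new_test, unrelated_measure)

# Convergent evidence: a high correlation with the same-construct test.
print(round(convergent_r, 2))
# Discriminant evidence: a correlation near zero with the unrelated measure.
print(round(discriminant_r, 2))
```

A pattern of high correlations with same-construct measures and low correlations with unrelated measures is the kind of evidence that accumulates gradually in support of construct validity.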

Establishing construct validity is also important because it gives psychologists the opportunity to determine whether a test acquires similar measures when compared to similar methods and that it does not measure constructs that it is not intended for. Furthermore, if a testing instrument does not have an adequate level of construct validity then a more valid method should be used. This is because it can help ensure that the results will be more accurate because they do not include invalid measures that were obtained by measuring irrelevant constructs.

How Psychologists Can Ensure Adequate Test Reliability and Validity

Since adequate levels of test reliability and validity are essential to acquire reliable and valid results with fewer errors, there are certain things that psychologists need to do to ensure that these measures exist before testing each participant. The first thing that psychologists should do is gather pertinent personal information about each participant before beginning the testing process. This way, the psychologist will have detailed background information that can be used to choose a test that is based on the individualized needs of each participant. This will also be a great way to reveal any racial, gender, educational and/or cultural background issues that may reduce the level of overall reliability and validity. Two examples of this would be if a psychologist administers a test written in English to a participant who can only read Spanish or if a psychologist administers a test about auto-mechanic skills to a 5-year-old child.

A second thing that should be done to ensure adequate levels of reliability and validity is to choose testing instruments that have already been evaluated for these aspects. This is because if previous research has already established adequate levels of these measures, then there may be a higher chance that these aspects will be measured again. Confirming previous reliability and validity can also aid psychologists in determining which testing methods might be most appropriate to use with each individual participant and/or specified need.

Furthermore, a third thing that should be done to ensure adequate reliability and validity is to follow all ethical and/or legal standards. This is important because these standards have been created to protect participants from experiencing certain negative psychological and/or physical effects that may have occurred in the past. Therefore, the psychologist will need to address all pertinent standards that may apply throughout the entire duration of each testing process.

One specific ethical standard that may apply when using testing methods with participants is confidentiality. This is because it helps protect the rights of all participants by mandating that personal information can only be released under specific circumstances. Following this standard is also important because it helps ensure that no harm occurs due to personal information being released in a malicious or damaging manner to third parties. However, the Behavior Analyst Certification Board (2004) has determined that a professional can disclose confidential information when it is mandated by law or for a valid purpose. Some examples of this are if a professional needs to provide service for an individual or organization, acquire payment for services that were previously rendered, or if a client is considered a danger to himself or others.

A second ethical standard that may apply when conducting assessment testing is cross-cultural sensitivity. This standard states that professionals must be aware of their own potential biases when selecting, administering, and interpreting tests, and must acknowledge potential effects due to differences in age, cultural background, ethnicity, disability, gender, religion, socioeconomic status, and sexual orientation. One example of a violation would be if a psychologist refuses to work with a participant from a foreign country.

A third ethical standard that may apply when using certain testing methods is informed consent. This is important because it states that professionals must acquire permission prior to assessing any participant. If the participant is a minor, a parent or caretaker must give consent before any testing can occur. This can also be addressed by ensuring that all pertinent consent forms are collected prior to beginning the overall testing process.

A fourth ethical standard that normally applies when using test methods is protection from harm. This is because it ensures that no psychological or physical harm will occur to research participants. Therefore, psychologists will need to determine the safest possible way to use a specific testing method and if no method is available, the test cannot be completed (Schacter, Gilbert, & Wegner. 2009). This can also be implemented by identifying any aspects of testing that may be harmful to one or more participants. Once these factors are identified, the professional must then take precautions to prevent this possible harm from ever occurring.

Finally, a fifth ethical standard that should be addressed prior to using most testing methods is test administration. This states that tests should be administered according to how they were established and any alterations should be noted and/or adjusted for accordingly. This is also important because it can ensure that the results will reflect measurements for a specific construct and/or domain. When this occurs, it may also be easier to measure adequate levels of reliability and validity for the specific testing method that is used (Schacter, Gilbert, & Wegner. 2009).

Why It’s Important to Ensure Validity in Different Types of Settings

Once the validity of a test has been established, a psychologist will need to ensure that it still holds when assessing participants in different types of settings. This is because the level of test validity can change based on varying factors within different settings. Therefore, certain steps may also need to be completed to ensure that an adequate level of validity can be measured, no matter which setting is used.

One specific setting that utilizes psychological testing on a normal basis is educational institutions. This occurs because achievement and aptitude tests are regularly used to measure a student’s overall level of knowledge about specific topics or aptitude that is needed to master material within a certain domain. A specific test instrument that is also widely used to assess these constructs is called the Scholastic Aptitude Test (SAT) because institutions of higher education can use the scores to make student admissions decisions (CollegeBoard. 2013).

Since this test is designed to assess a student’s achievement or aptitude for future success, it’s also important that all forms of validity are present. This is because face validity indicates that the test appears to measure achievement and/or aptitude at a level that is acceptable to continue further research and design. Content validity indicates that the test does actually measure achievement and aptitude at a level that is acceptable. Criterion validity indicates that the test produces similar results when compared to valid established instruments that measure achievement and aptitude. Finally, construct validity indicates that there were similar measures when compared to valid achievement and aptitude tests and that the test does not measure irrelevant constructs that can negatively affect the overall level of validity.

A second setting that utilizes psychological testing on a normal basis is mental health clinics. This is because various test methods are regularly used to better understand individual style or aid in clinical diagnoses. One specific paper-and-pencil method that is widely used to assess personality is called the Minnesota Multiphasic Personality Inventory or MMPI. However, this test is not perfect and has raised questions concerning adequate measures of reliability and validity. Therefore, a revised version called the MMPI-2 can now be used, which contains 567 test items with scales that measure even more traits that are associated with abnormal behavior (Lezak, Howieson, & Loring. 2004).

Since testing methods like the MMPI and MMPI-2 are designed to assess individual personality for research and diagnostic purposes, it is also important that all forms of validity are present. This is because face validity indicates that the test appears to measure individual personality at a level that is acceptable to continue further research and design. Content validity indicates that the test actually does measure personality at a level that is acceptable. Criterion-related validity indicates that the test produces similar results when compared to valid established instruments that measure personality. Finally, construct validity indicates that there are similar measures when compared to valid personality tests and that it does not measure irrelevant constructs. Lezak, Howieson, & Loring (2004) also state that the MMPI-2 appears to establish a higher degree of construct validity because it is supported by more evidence of convergent and discriminant validation.

Summary

For several years, psychological testing has been conducted because it gives psychologists the opportunity to measure specific mental and/or behavioral characteristics in human beings. When this process is going to be completed, it is always important to try and choose a test method that will measure accurate levels of reliability and validity. This is because the final data may be considered more valuable and/or viable by other professionals within the field.

With these things in mind, my ultimate goal for this paper was to provide a better understanding of reliability and validity by addressing certain aspects that are directly associated with these concepts. This included first discussing what the reliability of a test is, why it is important, and how the two main types which are test-retest reliability and internal consistency reliability can be measured. I then addressed what the validity of a test is, why it is important, and how the four main forms which include face validity, content validity, criterion-related validity and construct validity can also be measured.

This was followed by identifying certain things that psychologists need to do to ensure that selected test methods will measure adequate levels of reliability and validity. Three of these include gathering personal information about each participant, choosing well-established methods, and addressing all ethical, legal, individual and socio-cultural issues that apply. Some specific ethical considerations that were also addressed in this section include confidentiality, cross-cultural sensitivity, informed consent, protection from harm, and test administration.

Furthermore, the final thing that I discussed in this paper is why all four forms of validity must be considered when using test methods in different types of settings. This is because the validity of a measure can change due to various factors within each setting. I am also confident that if this information is followed by psychologists, then it may be easier to confirm both reliability and validity when using test methods within all types of settings.

References:

Behavior Analyst Certification Board (2004). Guidelines for responsible conduct for behavior analysts. Retrieved via Kaplan Online Campus at http://contentasc.kaplan.edu.edgesuite.net/PS502_1004A/images/product/Guidelines%20for%20Responsible%20Conduct.pdf

Bordens, K., & Abbott, B. (2008). Research design and methods (7th ed.). New York, NY: The McGraw-Hill Companies, Inc.

Cohen, R. J., & Swerdlik, M. E. (2010). Psychological testing and assessment: An introduction to tests and measurement. Boston, MA: McGraw-Hill Higher Education.

CollegeBoard.Org. (2012). SAT Validity Studies. Retrieved via the World Wide Web at http://professionals.collegeboard.com/data-reports-research/sat/validity-studies

Neukrug, E. S., & Fawcett, R. C. (2010). Essentials of testing and assessment: A practical guide for counselors, social workers, and psychologists (2nd ed.). Belmont, CA: Brooks/Cole Cengage Learning.

Lezak, M., Howieson, D., & Loring, D. (2004). Neuropsychological assessment (4th ed.). Oxford: Oxford University Press.

Zechmeister, J. S., Zechmeister, E. B., & Shaughnessy, J. J. (2001). Essentials of research methods in psychology. New York, NY: The McGraw-Hill Companies, Inc.

UNDERSTANDING INFORMAL AND FORMAL ASSESSMENT METHODS

There are many times when assessment tests will be administered in order to measure certain mental and/or behavioral characteristics of a client. Two specific techniques that can be used by a clinician during this process are known as informal and formal assessment. Choosing which to use may also be based on the information that is needed, how much time is available, and funding.

There are also specific types of informal and formal methods that clinicians may use more frequently when conducting an assessment. Since this is the case, my ultimate goal for this work will be to provide a better understanding of these concepts by first discussing three specific types of informal methods that may be used more often than others. These include observation, records and personal documents, and performance based techniques. In order to explain how informal methods can affect the assessment process, I will also address some of the strengths and weaknesses that are associated with these methods when used in different settings.

The next section of this work will address formal assessment methods by first discussing three specific types which include the Wechsler Intelligence Scale for Children (WISC), Wechsler Adult Intelligence Scale Fourth Edition (WAIS-IV), and Wechsler Individual Achievement Test (WIAT). In order to better explain how these methods can affect the assessment process, I will also provide information about their overall purpose, the specific population that each is intended for and previous research that addresses reliability and validity measures. Furthermore, I will discuss some additional factors that should be considered which include the participant’s educational background, ethnicity, socio-economic status, and ensuring that all ethical and legal obligations are considered and applied during the overall process.

Informal Assessment Methods

Informal assessment methods are subjective and there are many times when they may be designed to meet the specific needs of a clinician. This is because it gives the clinician an opportunity to measure certain individual performance with casual techniques versus using methods that may require high participant involvement. Once these measures are obtained, the clinician can then implement performance objectives that may improve the overall behavior that was observed. Since these methods are often developed to meet specific assessment needs, they will also normally require less time, money and expertise than nationally developed techniques (Neukrug & Fawcett. 2010).

One specific informal assessment method that may be frequently used by a clinician is observation. This is a popular method within the psychology field because observers have an opportunity to casually observe the “behaviors of an individual in order to develop a deeper understanding of one or more specific behaviors.” When this method is used, the observer will normally also conduct time sampling, event sampling, and/or event and time sampling during the overall process. Time sampling is when behaviors are observed during a limited and set amount of time, while event sampling is the observation of a targeted behavior with no regard for time. However, there are also occasions when a combination of event and time sampling will be conducted to observe one or more targeted behaviors for a set amount of time (Neukrug & Fawcett. 2010. p. 308).
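The difference between these three sampling approaches can be sketched with a short Python example. The observation log, the behavior labels, and the function names below are hypothetical and are meant only to show how the same record of behavior could be summarized under each approach.

```python
def event_sampling(observations, target):
    """Count every occurrence of the target behavior, regardless of time."""
    return sum(1 for _, behavior in observations if behavior == target)


def time_sampling(observations, start, end):
    """Keep all behaviors seen within a set window, start <= minute < end."""
    return [behavior for minute, behavior in observations
            if start <= minute < end]


def event_and_time_sampling(observations, target, start, end):
    """Count the target behavior only within the set time window."""
    in_window = [(m, b) for m, b in observations if start <= m < end]
    return event_sampling(in_window, target)


# Hypothetical log: (minute of the session, behavior that was observed).
log = [
    (1, "out of seat"),
    (4, "talking"),
    (7, "out of seat"),
    (12, "out of seat"),
    (18, "talking"),
    (25, "out of seat"),
]

print(event_sampling(log, "out of seat"))                  # 4
print(time_sampling(log, 0, 10))                           # first three entries
print(event_and_time_sampling(log, "out of seat", 0, 10))  # 2
```

Event sampling here captures all four occurrences of the targeted behavior, while the combined approach only counts the two that fall within the first ten minutes of the session.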

A second informal method that may be frequently used by a clinician is records and personal documents. This is because it gives the clinician an opportunity to assess an individual’s behaviors, beliefs, and values by examining items such as diaries, autobiographies, genograms, school records, biographical inventories, or personal journals. This information can also be extremely useful because it may give the clinician a better understanding of the client’s overall personal views and/or thought processes (Neukrug & Fawcett. 2010).

A third informal method that may be used frequently by a clinician is called performance based assessment. This gives the clinician an opportunity to evaluate an individual by using various informal assessment procedures that are normally based on real world responsibilities. One particular type of performance based assessment that is widely used in several different settings is portfolios. This is a collection of an individual’s work that can be acquired over time and covers specific areas of content and performance (Neukrug & Fawcett. 2010).

Strengths and Weaknesses of Informal Assessment Methods

There are also strengths and weaknesses associated with informal methods like observation, records and personal documents, and performance based assessments when they are used in different settings. For example, when these methods are used in private practice and educational facilities, the clinician will be able to measure the current and progressive skills or abilities of a client over time. A second strength of informal methods within these settings is that the results can supplement standardized tests that lack crucial information about the client. This is important because a client’s behavior must be accurately measured before any beneficial treatment or intervention can occur. Furthermore, a third strength of using informal methods within these settings is that they may be less intrusive (Neukrug & Fawcett. 2010).

Even though these strengths exist, there are certain weaknesses associated with informal methods when used in different settings. For example, if these are used in private practice and educational facilities, there may be more cross-cultural issues and inadequate levels of reliability and validity. This is because vital factors like these may not be addressed with informal methods as extensively as with well-established formal methods (Neukrug & Fawcett. 2010).

Formal Assessment Methods

Formal assessment methods are considered to be more objective and they can be used in clinics, schools, private practices and residential treatment facilities in conjunction with other measures to aid with eligibility issues, diagnosis, educational placement, and decisions regarding intervention processes. Normally, formal assessment methods are used to acquire evidence that supports conclusions that are made from the test. One example of this could be if a clinician uses this method to confirm that a client’s reading ability is below average. This could be accomplished because there would be visible evidence to support the fact that the client’s scores fell in a below average range for that particular age group.

Many people also refer to formal methods as standardized measures because the collected data are mathematically computed and summarized using percentiles, standard scores, or stanines. Since this process is completed, these methods are also used more frequently in research and publishing to aid fellow professionals and students within the field. This is important to consider because it supports the idea that formal methods may be more test-worthy, reliable and valid than informal techniques (Cohen & Swerdlik. 2010).
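As a minimal sketch of how raw scores might be summarized in these three ways, the Python example below converts a raw score into a percentile rank, a standard score, and a stanine relative to a norm group. The norm scores are invented, and several choices are assumptions made for illustration: the percentile rank is defined as the percentage of norm scores below the raw score, the standard score uses a mean of 100 and a standard deviation of 15, and the stanine is approximated as twice the z-score plus five, rounded and limited to the range 1 to 9.

```python
from math import floor
from statistics import mean, pstdev


def z_score(raw, norm_scores):
    """Distance of a raw score from the norm mean, in standard deviations."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


def percentile_rank(raw, norm_scores):
    """Percentage of norm-group scores that fall below the raw score."""
    below = sum(1 for s in norm_scores if s < raw)
    return 100 * below / len(norm_scores)


def standard_score(raw, norm_scores, new_mean=100, new_sd=15):
    """Rescale the z-score to a chosen mean and standard deviation."""
    return new_mean + new_sd * z_score(raw, norm_scores)


def stanine(raw, norm_scores):
    """Approximate stanine: 2z + 5, rounded half up, clamped to 1..9."""
    value = floor(2 * z_score(raw, norm_scores) + 5.5)
    return max(1, min(9, value))


# Hypothetical norm group of raw scores.
norm_scores = [35, 40, 45, 50, 50, 55, 60, 65]

raw = 60
print(percentile_rank(raw, norm_scores))           # 75.0
print(round(standard_score(raw, norm_scores), 1))
print(stanine(raw, norm_scores))                   # 7
```

Published instruments rely on large standardization samples and their own scoring tables, so a sketch like this only shows the general idea behind each type of score.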

One specific formal assessment method that may be used frequently is the Wechsler Intelligence Scale for Children (WISC/WISC-IV). The purpose of this 15-subtest instrument is to measure overall intelligence by analyzing a client’s ability [to acquire and apply knowledge, reason logically, plan effectively, infer perceptively, make sound judgments and solve problems, grasp and visualize concepts, pay attention, be intuitive, find the right words and thoughts with facility, and cope with, adjust to, and make the most of new situations]. This can also be accomplished by acquiring 5 composite scores that represent an individual’s Verbal Comprehension Index (VCI), Perceptual Reasoning Index (PRI), Processing Speed Index (PSI), Working Memory Index (WMI) and Full Scale IQ (FSIQ). The population that this test is designed for includes children between 6 and 16 years of age, and it takes 65–80 minutes to complete (Cohen & Swerdlik 2010. p. 277).

There are also previous studies that have been conducted to examine this method’s overall level of reliability and validity. One major study consisted of a standardization sample of 2,200 children who were between the ages of 6 and 16 years, along with special group samples. The results indicated that adequate levels of reliability were present. Equivalency studies also supported evidence of convergent and discriminant validity when comparing the results to those acquired with similar methods. Furthermore, evidence of construct validity was also present after conducting numerous confirmatory factor-analytic and exploratory studies, along with mean comparisons when using matched samples of children (Cohen & Swerdlik 2010).

A second formal assessment method that may be used frequently is the Wechsler Adult Intelligence Scale Fourth Edition (WAIS-IV), which “is the latest version in a long line of Wechsler products dating back to the Wechsler–Bellevue Intelligence Scale.” It also consists of 15 subtests like the Wechsler Intelligence Scale for Children, but is designed to measure intelligence in those who are 16 to 90 years of age versus 6 to 16 years of age (Benson, Hulac & Kranzler. 2010. p. 121).

One particular study measured the overall level of reliability and validity of this instrument by using a standardized sample of age-appropriate participants over a two- to twelve-week period. The results indicated that this method establishes a fairly high level of consistency, with test-retest coefficients ranging from 0.70 (7 subscales) to 0.90 (2 subscales). When examining inter-scorer reliability, the coefficients were also very high, with values above 0.90. Furthermore, this study also found a correlation of 0.88 with a similar method known as the Stanford-Binet IV (Benson, Hulac & Kranzler. 2010).

A third formal assessment method that may be used frequently is the Wechsler Individual Achievement Test Second Edition (WIAT-II). The purpose of this test is to assess a client’s level of achievement by measuring skills like writing, spelling, reading and mathematics. The population that this test is designed for includes clients between 4 and 85 years of age and it can “be administered by psychologists, educational diagnosticians, special education teachers, and anyone else trained in the administration of individual tests” (Treloar. 1994. p. 1).

There are also three types of reliability that have been regularly measured by using a standardized sample of participants. The first type is internal consistency reliability which is the “consistency of an item within a measure-that is, how consistently all the items measure the same construct” (Zechmeister, Zechmeister & Shaughnessy. 2001. p. 119). The average reliability coefficients for this test are also generally high and range from .80 to .98.

The second type that is regularly measured is test-retest reliability. This is the consistency of individual responses over a period of time, and previous results indicate that the average stability coefficients are high and range from .85 to .98. The third type is interscorer reliability, which is the degree of overall agreement between scorers; these results often range from .94 to .98, with an overall reliability score of .94. Furthermore, when assessing overall validity, corresponding subtests from the WIAT and WIAT-II are strongly correlated, with coefficients above .80 (Zechmeister, Zechmeister & Shaughnessy. 2001).
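
A stability coefficient of this kind is usually a correlation between scores from two administrations of the same test. The sketch below assumes a Pearson correlation, which is a common choice but is my assumption rather than something stated in the cited sources. The function name and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    n = len(first)
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum(
        (a - mean_first) * (b - mean_second) for a, b in zip(first, second)
    )
    # Sums of squared deviations for each administration.
    ss_first = sum((a - mean_first) ** 2 for a in first)
    ss_second = sum((b - mean_second) ** 2 for b in second)
    return cross / sqrt(ss_first * ss_second)


# Hypothetical scores for the same six clients tested several weeks apart.
time_one = [98, 105, 87, 110, 93, 101]
time_two = [100, 103, 90, 112, 91, 99]
print(round(pearson_r(time_one, time_two), 2))
```

The same calculation could be applied to two scorers rating the same set of responses as a rough check on interscorer agreement.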

When conducting a formal assessment, the overall purpose, intended population, and levels of overall reliability and validity are also critical components to consider prior to beginning the overall process. This is because if a test is administered to measure a construct other than the one it is intended to measure, or is administered to those who are not in the intended population, any results may be considered less reliable and valid. When this occurs, the client may also not receive the proper techniques and/or services that are needed to successfully address specific target behaviors, and the current level of those behaviors will therefore likely remain unchanged (Zechmeister, Zechmeister & Shaughnessy. 2001).

Additional Factors to Consider When Using Formal Assessment Methods

When conducting assessments, the clinician normally addresses a variety of questions which are often based on an open awareness of their client’s individualized needs and overall psychological state. However, in order to complete this process in the best manner possible, the clinician must address all factors that can affect the results. Some factors can also cause issues throughout all stages of the process, so it is crucial to address these immediately upon occurrence (Lezak, Howieson, & Loring. 2004).

One specific factor that can affect the results is varying educational backgrounds among clients. This is important because if the clinician administers a test that a client cannot comprehend due to a lack of education, the results will not be as reliable and/or valid. On the same note, the clinician must also be trained to properly read and interpret the results of the test. This is important because if a clinician cannot complete this step due to a lack of comprehension, then there may be reliability or validity concerns, and it could be a waste of the time, money and resources that are needed to complete a formal assessment (Lezak, Howieson, & Loring. 2004).

A second factor that can affect the results of an assessment is the ethnicity of each client. For example, if a client can only read Spanish but is given a test that is written in English, then the results will not reflect true measures. Furthermore, there may also be times when a clinician holds biased or prejudiced feelings toward one or more participants, and the results won’t be accurate because scores could be acquired or “fudged” based on this negative way of thinking (Lezak, Howieson, & Loring. 2004).

A third factor that can affect the results of a formal assessment is the socio-economic status of each client. One reason for this is that when adults and children come from a low income household, they may not be able to pay for formal methods, so less costly ones will be used. Many times, these methods can lack the crucial information that is needed to offer a proper diagnosis or provide a specific service. Furthermore, previous research has also indicated that children who live in low-income neighborhoods may experience higher levels of abnormal motor development, malnutrition, and/or emotional instability due to a lack of parental knowledge or not having access to needed services. Therefore, it will be crucial to address these issues before using specific methods so the results will be more accurate (Lezak, Howieson, & Loring. 2004).

Ethical Codes That Could Apply When Using Formal Assessment Methods

The American Psychological Association (APA) also created a set of ethical standards that must be applied during the entire duration of most assessment processes. This is because following these standards can prevent unethical or harmful treatment from occurring.

One specific ethical standard that may apply when using testing methods is confidentiality. This is because it helps protect the rights of all participants by mandating that personal information can only be released under specific circumstances. Following this standard is also important because it helps ensure that no harm occurs due to personal information being released to third parties in a malicious or damaging manner. However, the Behavior Analyst Certification Board (2004) has determined that a professional can disclose confidential information when it is mandated by law or for a valid purpose. Some examples of this are if a professional needs to provide service for an individual or organization, acquire payment for services that were previously rendered or if a client is considered a danger to himself or others.

A second standard that may apply when using formal methods is cross-cultural sensitivity. This is because it states that psychologists must be aware of their potential biases when selecting and administering tests and interpreting results, and must acknowledge potential effects due to differences in age, cultural background, ethnicity, disability, gender, religion, socioeconomic status, and sexual orientation (Behavior Analyst Certification Board. 2004). One example of a violation of this standard would be if a psychologist refuses to test a participant from a foreign country.

A third ethical standard that may apply when using certain testing methods is informed consent. This is important because it states that professionals must acquire permission prior to assessing any participant. If the participant is a minor, a parent or caretaker must give consent before any testing can occur (Schacter, Gilbert, & Wegner. 2009). This can also be addressed by ensuring that all pertinent consent forms are collected prior to beginning the overall process.

A fourth ethical standard that usually applies when using test methods is protection from harm. This is because it ensures that no psychological or physical harm will occur to research participants. Therefore, psychologists will need to determine the safest possible way to use a specific testing method and if no method is available, the test cannot be completed (Schacter, Gilbert, & Wegner. 2009). This can also be implemented by identifying any aspects of testing that may be harmful to one or more participants. Once these factors are identified, the professional must then take precautions to prevent this possible harm from ever occurring.

Furthermore, a fifth ethical standard that should be addressed prior to using most testing methods is test administration. This states that tests should be administered according to how they were established and any alterations must be noted and/or adjusted accordingly. This is also important because it can ensure that the results will reflect measurements for a specific construct and/or domain. Therefore, it may also be easier to measure adequate levels of reliability and validity for the specific method that is used (Schacter, Gilbert, & Wegner. 2009).

Summary

There are many different settings where assessment tests are administered in order to measure possible mental and/or behavioral characteristics of a client. Two common methods that can be used by a clinician to acquire data are informal and formal assessments. A clinician may also choose which method is best by determining the specific information that needs to be acquired, the time that is available for data collection, and whether proper funding is available.

There are also specific types of informal and formal methods that clinicians may use more frequently when conducting an assessment. Therefore, my main goal for this work was to first provide a better understanding of informal methods by discussing three specific types which include observation, records and personal documents, and performance based techniques. In order to better explain how these methods can affect the assessment process, I also discussed the strengths and weaknesses that may be associated with each, when used in specific settings.

I then addressed what formal assessment methods are by also discussing three specific types which include the Wechsler Intelligence Scale for Children (WISC), Wechsler Adult Intelligence Scale Fourth Edition (WAIS-IV), and Wechsler Individual Achievement Test (WIAT). In order to better explain how these methods can affect the assessment process, I also provided information about their overall purpose, the specific population that each is intended for and previous research about reliability and validity measures. Finally, I addressed certain additional factors that should always be considered when using formal methods. Some of these include the participant’s educational background, ethnicity, and socio-economic status, along with ensuring that all ethical and legal obligations are considered and applied during the overall process.

References:

Benson, N., Hulac, D. M., Kranzler, J. H. (2010). Independent examination of the Wechsler Adult Intelligence Scale—Fourth Edition (WAIS–IV): What does the WAIS–IV measure? Retrieved via the Kaplan Library at http://ehis.ebscohost.com.lib.kaplan.edu/eds/pdfviewer/pdfviewer?sid=a11e34cf-734f-4245-91bc-dc24fe1e6478%40sessionmgr4&vid=9&hid=6

Cohen, R. J., & Swerdlik, M. E. (2010). Psychological testing and assessment: An introduction to tests and measurement. Boston, MA: McGraw-Hill Higher Education.

Lezak, M., Howieson, D., & Loring, D. (2004). Neuropsychological assessment (4th ed.). Oxford: Oxford University Press.

Schacter, D., Gilbert, D., Wegner, D. (2009). Psychology. New York, NY: Worth Publishers.

Treloar, J. M. (1994). Wechsler Individual Achievement Test (WIAT). Intervention in school & clinic. Sage Publications Inc. Retrieved via the Kaplan Library at http://ehis.ebscohost.com.lib.kaplan.edu/eds/detail?vid=4&sid=eae9e6c0-8d36-4912-ac6e99dc6a3699c1%40sessionmgr104&hid=110&bdata=JnNpdGU9ZWRzLWxpdmU%3d#db=f5h&AN=9602291490

Zechmeister, J. S., Zechmeister, E. B., & Shaughnessy, J. J. (2001). Essentials of research methods in psychology. New York, NY: The McGraw-Hill Companies, Inc.

BULLYING IN SCHOOL

In today’s society, one reason that some “researchers get drawn in to the enterprise of developmental psychology is that they are captivated by and want to understand the fascinating, complex, and often times surprising array of behaviors children display” (Bukatko, 2008. p. 40). One particular issue that captured my interest during this degree program is determining which adolescents are being psychologically and/or physically bullied while in school. This is because previous research has indicated that bullying among this age group has become so serious that nearly one third of the overall population has reported being victimized. If this issue is not addressed in a positive manner, then more children among this age group may become victims of this growing social phenomenon (D'Esposito, Blake & Riccio. 2011).

Since this issue seems to be occurring at a growing rate, I would like to conduct my own study to identify which students are being bullied. This is because I suspect that several students are being bullied while in school. This study will also be useful because I can report any pertinent findings to the teachers, principal and other pertinent staff members for further investigation. These professionals may then use this information to establish programs and/or procedures that could prevent this type of bullying from occurring. This is also crucial because if this assistance is successfully implemented, then the number of overall bullying incidents among this age group may be reduced or eliminated.

With this in mind, the overall purpose of this paper will be to discuss the method that I would use to conduct my own study, the intended population and time frame, how to analyze and interpret the data, how to ensure adequate levels of reliability and validity, some ethical, legal, individual and socio-cultural concerns that could arise, why this research could be important, and some limitations that may occur when using the selected approach in a specific setting.

Method

In order to successfully complete my study and provide valuable information to staff members, I will use a qualitative research approach because it can determine the who, what, and when that is associated with a specific topic of study. This is different from an alternate approach known as quantitative research, which would be used if I wanted to determine the why and how that is associated with a certain topic of study (Patton. 2001).

Those who support the qualitative research approach also believe that the collected data is detailed, contextual, sensitive and nuanced compared to that of the quantitative method. This is because it can produce data which has greater breadth and depth and there are multiple methods that can be used to address sensitive subjects like how students are being bullied (Patton. 2001). Using this approach may also allow me the opportunity to acquire a better understanding of this social phenomenon based on viewpoints from those who have been directly affected.

When using this approach, the first technique that I will use to collect data is document analysis. This is because it can be used when a researcher wants to collect “written materials and other documents from organizational, clinical, or program records; memoranda and correspondence; official publications and reports; personal diaries, letters, artistic works, photographs, and memorabilia; and written response to open-ended surveys” (Patton. 2001. p. 4). This means that I will be able to acquire previously written materials that directly relate to the topic under study.

For this particular study, I will also use this technique to first acquire information that identifies which students have previously reported being bullied. This may be useful because it will give me specific participants to begin my study with who have also had firsthand experience with the topic of study. I will also attempt to collect this data through the guidance office, human resources department and/or school psychologist.

Once this is completed, the second thing that I will do to acquire even further data is use a method known as unstructured participant observation. This can also be referred to as naturalistic observation and it means that I will be able to observe participant behavior in a natural setting, without any type of intervention (Zechmeister, Zechmeister & Shaughnessy. 2001). The purpose of this will be to determine if I observe bullying behavior that occurs to students who have previously reported being bullied while in school. One example of this could be that I observe a female student who gets kicked repeatedly under the desk by another student but doesn’t tell the teacher. A second example could be that I observe a student being bullied online while in technology class, and she has already reported this to school staff in the past.

Even though document analysis and participant observation may be enough to determine that certain students are being psychologically and/or physically bullied, I may need to use a third method known as unstructured interviews. This means that I will be able to ask open-ended questions that could initiate new leads or further details which identify those students who are being bullied in school. This data could also prove to be extremely valuable because it will come directly from students who have experienced being bullied in the past. Two advantages of this process are that I may be able to build a better rapport with the students and obtain more in-depth information. However, two disadvantages of using unstructured interviews are that I may miss important information due to getting caught up in a participant’s story, or may spend too much time on one specific topic (Neukrug, Fawcett. 2010).

Furthermore, one other qualitative approach that I may need to use during this specific inquiry is case studies. This is because using this technique can be a valuable way to acquire specific information that is needed for individualized program evaluation. This means that specific programs may be implemented to meet the needs of an individual student. Completing this process may also be beneficial if the data that I initially acquire from document analysis, participant observation and unstructured interviews appears to contain errors, has inadequate levels of reliability and validity, or can’t provide specific data that is needed for these programs. However, even though this may be true, using case studies and interviews can also take longer, cost more, and be more intrusive to participants when compared to observation and document analysis (Neukrug, Fawcett. 2010).

Sample of Participants and Duration

The sample of participants for this study will include approximately 150 adolescent students who have previously reported being bullied to the guidance office, human resources department or school psychologist. The age of participants will range between 13 and 17 years old. The reason that I will choose this particular population is that if I conduct my study with the wrong population, then the results may be considered less reliable and valid. If this occurs, the school may not implement the proper techniques and/or services that are needed to successfully address this issue, so the current level of bullying will likely remain unchanged or increase (Zechmeister, Zechmeister & Shaughnessy. 2001).

The particular setting for this study will be within multiple divisions of the local public school including the cafeteria, individual classroom and gymnasium. The time frame for this study will also include one-hour daily sessions over a 24-week period. During this time I will use any pertinent documentation that has been acquired through other school divisions, along with all personal notes that are collected while observing each individual student. If interviews are required later in the process, I will also design a topic-specific questionnaire that can be used to possibly acquire any data that is still needed or missing.

If case studies are also needed, I will set up weekly one-on-one sessions with the same sample of students who are taking part in this study. This is important because the additional data will be collected throughout the entire duration of the study from the same source of information. When this is completed, there may also be a greater breadth and depth of information about these participants that also identifies students who are actually being bullied while in school.

Once I obtained all of the data, I would then attempt to analyze and interpret my findings by using grounded theory. This is a good approach to use because I could build a theoretical framework which may expose only those vital and valuable details that truly explain the data and/or phenomenon that is being studied. This would also be completed by reading and reviewing the data, writing notes, and using coding. When conducting qualitative analysis, coding means that I would attempt to identify certain themes within the collected materials that directly relate to the topic of study. These themes could also be identified by observing any common ideas and patterns that are repeatedly present within the written data (Patton. 2001).
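
Once codes have been assigned by hand, the step of spotting themes that are repeatedly present can be supported by a simple tally. The following is only a rough sketch under my own assumptions: the excerpt structure, the code labels, and the function names are all hypothetical, and the tally does not replace the researcher's judgment when assigning or interpreting codes.

```python
from collections import Counter


def tally_themes(coded_excerpts):
    """Count how many excerpts were tagged with each code.

    Each excerpt is a dict with a "codes" list assigned by the researcher.
    A code is counted once per excerpt, even if it was listed twice.
    """
    counts = Counter()
    for excerpt in coded_excerpts:
        counts.update(set(excerpt["codes"]))
    return counts


def recurring_themes(counts, min_count=2):
    """Return codes that appear in at least min_count excerpts."""
    return [code for code, n in counts.most_common() if n >= min_count]


# Hypothetical field notes that have already been coded by hand.
notes = [
    {"source": "observation", "codes": ["physical", "unreported"]},
    {"source": "interview", "codes": ["online", "unreported"]},
    {"source": "document", "codes": ["physical"]},
]
print(recurring_themes(tally_themes(notes)))
```

A tally like this only shows how often each code appears; deciding what the recurring codes mean remains part of the interpretation step.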

Once the analysis was complete, I would then try to interpret the data by attaching significance to any themes and patterns that were observed. This could be accomplished by writing a list of key themes and then considering any alternative explanations that may exist by looking for further differences in responses or observations within the data. Finally, I would draft a report that details my findings. This is an important final step because it will give me an opportunity to make sense of the data through the use of synthesizing and summarizing pertinent information (Patton. 2001). One specific thing that following this step could provide is supporting evidence that identifies certain students that are being bullied while in school.

Furthermore, I would also need to ensure that my final data can provide evidence that supports adequate levels of overall reliability and validity. This is an important step because if these are not present, then my final data may possess more errors and not be considered reliable and/or valid by others within the scientific community. Therefore, I would complete this process by using certain techniques throughout the duration of this study.

The first one would include ensuring that I collect all personal information about each participant that may be pertinent to the study. This way, I will have detailed background information that can be used to make decisions that are based on the individualized needs of each participant (Neukrug, Fawcett. 2010).

Secondly, I would choose qualitative methods that have demonstrated adequate levels of reliability and validity in previous research. This is because if previous research has already established adequate levels of these measures, then there may be an increased chance that they will be achieved again. Following this process could also help me identify which methods might be most appropriate to use for my particular study (Bordens, Abbott. 2008).

Furthermore, a third thing that I could do to ensure adequate levels of reliability and validity is to consider all applicable ethical and/or legal standards. This is important because these standards were created to protect participants from experiencing certain psychological and/or physical effects that may have occurred in the past (American Psychological Association. 2013). Therefore, I would also be sure to follow these standards during the entire duration of my study.

Ethical, Legal, Individual & Socio-Cultural Concerns That Could Arise

When conducting this study, one major standard that could apply is confidentiality because it protects the rights of a participant by mandating that personal information can only be released under specific circumstances. Following this standard is also important because it can ensure that no harm occurs due to personal information being released to third parties in a malicious or damaging manner. However, the Behavior Analyst Certification Board (2004) states that a behavior analyst can disclose confidential information when it is mandated by law or for a valid purpose. Some examples of this are if a professional wants to provide services for a client or organization, to acquire payment for services that have been previously provided or if a client might be a danger to himself or others (p. 4). I will also implement this into my own study by having each participant sign a written document which states that information will only be provided to the educational system heads if it is for a valid purpose. One example of a valid purpose that could also occur in my study is if I choose to report part of my findings because a female participant appears to be in danger of harming herself due to being continuously bullied.

A second standard that may apply when conducting this study is cross-cultural sensitivity. This is because it states that psychologists must be aware of their potential biases when selecting and administering tests and interpreting results, and must acknowledge potential effects due to differences in age, cultural background, ethnicity, disability, gender, religion, socioeconomic status, and sexual orientation (Behavior Analyst Certification Board. 2004). I can implement this by ensuring that no aspects of the study or collected data are based on my own personal biased opinions and thoughts. One example of a violation of this standard would be if I refused to allow a certain male student to be part of my study because he is from a foreign country.

A third ethical standard that may apply when conducting this study is informed consent. This is important because it states that professionals must acquire permission prior to assessing each participant. If the participant is a minor, a parent or caretaker must give consent before any testing can occur (Schacter, Gilbert, & Wegner. 2009). This can also be addressed in my own study by ensuring that all consent forms are collected from parents before beginning the process.

A fourth ethical standard that usually applies when conducting a qualitative study is protection from harm. This is because it ensures that no psychological or physical harm will occur to research participants. Therefore, I will need to determine the safest possible way to use this approach and if no safe option is available, then the research cannot be completed (Schacter, Gilbert, & Wegner. 2009). This can also be implemented in my own study by identifying any aspects that may be harmful to one or more participants. Once these factors are identified, I will then take precautions to prevent this harm from occurring. One example of this could also be if I assist a student who is in obvious distress after answering my questions about being bullied.

A fifth standard that may apply when conducting this study is to acquire proper Institutional Review Board (IRB) approval. This means that all researchers must seek prior approval before conducting any research study. Fisher (2009) determined that following this standard is important because it ensures that the IRB has approved a research proposal based on a specified protocol (p. 205). I will also follow this standard in my own study by ensuring that the overall research proposal is approved by the local school system and pertinent staff members before beginning.

Furthermore, a sixth standard that may apply when conducting this study is release of qualitative data. This means that the data and/or any findings cannot be released to others unless the participant or parent has signed a release form, the receiving individuals can adequately analyze and interpret the data, and the information will not be misused in any way. This is also important because release of certain information could be damaging to participants and others who are involved in the process. Therefore, I can follow this standard in my own study by ensuring that data will only be shared if a release form has been obtained, the recipient can analyze and interpret the data correctly, and will not misuse it in any way (Fisher. 2009).

The Importance of This Study

When conducting this study, one major benefit is that it may add to existing literature that addresses students who are being bullied while in school. This is important because bullying seems to be a major issue within our current school systems, so it is essential to produce as much valuable research as possible. This information can then be used to offer a better understanding of this social phenomenon to aid in prevention efforts.

One example of a new and original thing that could also be discovered with this study is the identification of specific ways that students are being psychologically and physically bullied while in school. This is important because the information can then be used in combination with previous research to prevent this specific bullying behavior from occurring. If this behavior is addressed appropriately, then students may also not have to experience certain psychological conditions, like anxiety and depression, that can occur after experiencing such a social trauma (D'Esposito, Blake & Riccio. 2011). If this information does aid in a reduction and/or elimination of bullying incidents among adolescents, then it could change the entire face of advancing psychological science and/or research concerning this topic.

Possible Limitations of Using This Approach

Even though conducting a qualitative inquiry may identify which students are being bullied while in school, there may be certain limitations associated with using this research design in an educational setting. One specific limitation is that the collected data may be lacking due to weaknesses that are naturally associated with using a qualitative design approach. Some of these are that the results can’t be generalized to the overall school population because data will be gathered from fewer participants, and that the data might be difficult to analyze, lack consistency and reliability, and be time consuming and costly to collect if several methods are used (Patton. 2001).

A second limitation of using a qualitative design approach in an educational setting is that the overall process may not be completed as initially expected. One reason that this can occur is that there might not be a universal agreement among all people involved due to varying opinions, emotions, feelings, ideas and solutions (Patton. 2001). One example of this is if I conduct my inquiry to determine which students are being bullied but the school’s evaluation team and action researchers can’t choose what programs or procedures to implement. If these professionals are unable to choose effective programs and/or processes, then the number of students who experience bullying behavior may also remain unchanged or increase. If this occurs, then some people who are involved with the study may determine that this particular inquiry was a big waste of time and money because it did not improve anything or address the real problem.

Summary

In today’s society, some researchers are drawn in to the field of developmental psychology because they want to learn and understand the different behaviors that children exhibit (Bukatko, 2008). One particular behavior that captured my interest while obtaining a master’s degree is that adolescents are being psychologically and/or physically bullied while in school. Previous research has also indicated that bullying among this age group is so severe that nearly one third of the overall population has reported being victimized. If this issue is not addressed then more children within this age group may also become victims of this growing social phenomenon (D'Esposito, Blake & Riccio. 2011).

Since this issue seems to be occurring at a growing rate, I would like to conduct my own study to possibly identify which students are being bullied. Therefore, the overall purpose of this paper was to discuss the method that I will use to conduct my study, the intended population and time frame, how to analyze and interpret the data, how to ensure adequate levels of reliability and validity, some ethical, legal, individual and socio-cultural concerns that could arise, why this research may be important, and some limitations that might occur when conducting a qualitative inquiry within an educational setting. If this overall process is followed, I am also confident that I will be able to determine which students are being bullied while in school. This will be beneficial because the findings can be used by teachers, the principal and other pertinent staff members for further investigation and implementation of prevention programs. This assistance is also crucial because if it is successfully implemented, then the number of overall bullying incidents among adolescents and other school age children may be reduced and/or possibly even eliminated.

References:

American Psychological Association (2013). Guidelines for providers of psychological services to ethnic, linguistic, and culturally diverse populations. Retrieved via the World Wide Web at http://www.apa.org/pi/oema/resources/policy/provider-guidelines.aspx

Behavior Analyst Certification Board (2004). Guidelines for responsible conduct for behavior analysts. Retrieved via Kaplan Library.

Bukatko, D. (2008). Child and adolescent development: A chronological approach. Ohio: Cengage Learning.

D'Esposito, S. E., Blake, J., Riccio, C. A. (2011). Adolescents' vulnerability to peer victimization: Interpersonal and intrapersonal predictors. Retrieved via the Kaplan Library.

Fisher, C. B. (2009). Decoding the ethics code (2nd ed.). Thousand Oaks, CA: Sage Publications.

Neukrug, E. S., & Fawcett, R. C. (2010). Essentials of testing and assessment: A practical guide for counselors, social workers, and psychologists. (2nd ed.). Belmont, CA: Brooks/Cole Cengage Learning.

Patton, M.Q. (2001). Qualitative research & evaluation methods. Thousand Oaks, CA: Sage Publications.

Schacter, D., Gilbert, D., & Wegner, D. (2009). Psychology. New York, NY: Worth Publishers.

Zechmeister, J. S., Zechmeister, E. B., & Shaughnessy, J. J. (2001). Essentials of research methods in psychology. New York, NY: The McGraw-Hill Companies, Inc.