To determine the effect of curators on the trustworthiness of recommendation systems (RSs), this paper takes users’ willingness to accept the recommendations given by the RSs as the dependent variable, and uses users’ perceived trust and the perceived transparency of the RS’s recommending procedure as the independent variables. The study was run as an experiment embedded in an online survey with a 3x2 between-subjects design to test these variables and verify the hypotheses. To facilitate subsequent data analysis, the questionnaire followed a randomized controlled trial design (Chalmers et al., 1981), in which participants were allocated to different comparison groups. This section describes the research methods and their implementation in depth. First, the data collection and sampling strategies are outlined, with short explanations of why they were used. The measures used to gather data are then described to show what was measured and how.
Data collection and sampling strategy
This thesis conducts field research in the form of questionnaires, since an online experimental design makes participants easy to reach and is simple to implement under the given time and resource constraints. The first step in collecting primary data is to determine the sampling method. Participants were selected using a nonprobability convenience sampling approach, and the sample size was set with reference to the central limit theorem. On that basis, at least 150 participants are needed to complete this research.
The next step is to design the questionnaire. Following the introductory text, the first part of the questionnaire presents a scenario to the subjects. Respondents were told that they were considering buying a new mobile phone and were looking for a trustworthy purchase option through a recommendation platform. Each respondent randomly received one of three recommendation prototypes offered by the two kinds of systems described earlier. Within the curator system, this paper examines two types of curator, named Fashnetic and Devicnetic: Devicnetic gives recommendations based on her/his expertise, and Fashnetic based on her/his fame. The assigned recommendation was described as the solution the platform provides to users (respondents) and was presented in video form, created with Adobe Photoshop. The survey then asked participants about their intention to accept the offered recommendation.
In the second part, the between-subjects method was also used for the moderator. Apart from receiving recommendations from different RSs, respondents were randomly allocated to two groups. The two groups received the same survey, differing only slightly in the transparency of the recommending procedure they were shown, which produces a measurable variance in perceived transparency and makes it possible to study the moderating effect of transparency on the intention to accept recommendations. Half of the respondents received cues of relatively high transparency in the recommendation procedure, such as how the recommendation system arrived at its recommendations. The other half received relatively low transparency.
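The random allocation to the 3x2 design can be sketched in a few lines of Python. This is only an illustration and not the survey platform's actual mechanism: the condition labels, the function name `assign_conditions`, the balanced cell sizes, and the fixed seed are all assumptions made for the example.

```python
import random
from collections import Counter

# Hypothetical labels for the three recommendation types and two transparency levels.
RECOMMENDERS = ["general_rs", "fashnetic", "devicnetic"]
TRANSPARENCY = ["low", "high"]

def assign_conditions(n_participants, seed=0):
    """Randomly allocate participants to the six cells, keeping cell sizes balanced."""
    cells = [(r, t) for r in RECOMMENDERS for t in TRANSPARENCY]
    # Cycle through the six cells so each gets an (almost) equal number of slots.
    slots = [cells[i % len(cells)] for i in range(n_participants)]
    # Shuffle so that the order in which participants arrive decides nothing.
    rng = random.Random(seed)
    rng.shuffle(slots)
    return slots

assignments = assign_conditions(150)
cell_sizes = Counter(assignments)
```

With 150 participants this assumed scheme places 25 respondents in each of the six cells; simple unrestricted randomization would give slightly unequal cells instead.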
Third, to test whether trust plays a mediating role, this research combined and streamlined trust measures validated in previous papers. As mentioned in the literature review, this paper focuses only on users’ trusting intention, since trusting beliefs lead to users’ trusting intentions. This choice also reflects the concern that lengthy questionnaires may reduce respondents’ attention and lead to dismissive answers. At the end, several demographic questions were asked, namely gender, age, and education level.
Additionally, to minimize external influences on the results, several control measures were built into the questionnaire design. First, the scenario is set in a single industry, the mobile phone industry. This industry was selected because it is relatively gender-neutral, meaning that males and females tend to have similar demands and interests. Moreover, the curator profiles were self-created, covering aspects such as the number of subscribers, average views per video, and a short background description, which prevents bias arising from respondents’ previous knowledge of existing influencers. Respondents were then asked to rate the curator’s popularity and expertise and the transparency of the recommending procedure, to check whether the manipulations were successful.
The trusting intention scale originates from Dobing (1993) and was adapted by McKnight, Choudhury, and Kacmar (2002). The scale consists of four items, all of which have shown validity in previous papers, with relatively high factor loadings and reliability (Cronbach’s alpha). Respondents were asked about their willingness to depend on the recommendations given by the recommendation platform (example item: “When I need advice for a purchase, I would feel comfortable depending on the recommendations provided by a recommendation platform”).
Respondents in the higher transparency setting received a list of questions about their preferences for mobile phones, such as the attributes they care most about and their expected price. In the other setting, nothing was displayed before the suggestion was given.
Five variables were controlled for to exclude alternative explanations. The first is the degree of expertise and popularity of the curators, measured by asking respondents: “Please rate your perception of the popularity and the expertise of the curators who were presenting in videos” on a five-point Likert scale from 1 (low) to 5 (high). The second is respondents’ perception of the transparency of the procedure, measured by asking: “To what extent do you think you know how the recommendation system works?” on a five-point Likert scale from 1 (fully aware) to 5 (fully unaware). To prevent respondents from learning the intent of the test in advance, the manipulation check questions appeared at the end. Gender, age, and education level were included as control variables because they can affect attitudes (Chan, Taylor, & Markham, 2008; Spreitzer, 1995). For instance, Gabriel and Gardner (1999) reported that men tend to be more collectively interdependent, whereas women are usually more relationally interdependent.
Men and women might therefore have different preferences regarding popularity and expertise. Sutter and Kocher (2006) found that trust increases with consumers’ age. Gender was measured in nominal form, whereas age and education level were measured in ordinal form; age was divided into six categories ranging from under 18 to over 55.
This paper used regression to estimate the extent of the effects by examining the path coefficients between the variables. Before the analysis, preliminary data preparation and descriptive statistics were required. After these two basic steps, the manipulation checks, reliability, and validity were tested to indicate the quality of the research design. Lastly, the previously established hypotheses were tested and analyzed in multiple stages.
Before the actual tests, the raw data were cleaned and converted to a format that the software could handle. There were originally 160 participants. However, some participants gave the curator’s expertise and popularity exactly the same rating, which is very unlikely for anyone who read the curator’s description carefully. After excluding these participants, 154 remained. First, the level of transparency of the recommending procedure was recorded as a binary variable: respondents who answered the preference-list questions were deemed to be in the high-transparency scene and coded as 1, and those in the low-transparency scene as 0. Furthermore, the type of recommendation randomly offered to respondents is a nominal variable. Nominal variables carry only category labels and no numerical value, so they cannot be used directly in a regression. They therefore need to be recoded into dummy variables, with one level acting as the baseline. The offered recommendations consist of three types (Official advertisement, Fashnetic, Devicnetic). Fashnetic and Devicnetic were merged into one category because both belong to curator systems (CSs), and the traditional RS acts as the baseline. The independent variable was thus recoded as a dummy variable, represented as Curator System in the tests and tables: its value is 1 when the received recommendation was provided by a CS (Fashnetic or Devicnetic) and 0 when it was provided by the general RS.
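The recoding rules described in this paragraph can be illustrated with a short Python sketch. The field names (`recommendation`, `answered_preference_list`) and the three example rows are hypothetical; the actual survey export may be structured differently.

```python
# Recommendation types that belong to curator systems (CSs), as described in the text.
CURATOR_TYPES = {"Fashnetic", "Devicnetic"}

def code_curator_system(recommendation):
    """Return 1 for a curator-system recommendation, 0 for the general RS baseline."""
    return 1 if recommendation in CURATOR_TYPES else 0

def code_transparency(answered_preference_list):
    """Return 1 for the high-transparency scene, 0 for the low-transparency scene."""
    return 1 if answered_preference_list else 0

# Made-up rows standing in for cleaned survey responses.
rows = [
    {"recommendation": "Official advertisement", "answered_preference_list": False},
    {"recommendation": "Fashnetic", "answered_preference_list": True},
    {"recommendation": "Devicnetic", "answered_preference_list": False},
]

coded = [
    {
        "curator_system": code_curator_system(row["recommendation"]),
        "transparency": code_transparency(row["answered_preference_list"]),
    }
    for row in rows
]
```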
After the data were cleaned and coded, descriptive statistics were computed. They show that 84 males and 70 females participated in the survey. With regard to age, 76% of participants are between 18 and 34, and the 18-24 group has the highest share (46.8%), which is due to the distribution channels available to a student. For the same reason, most respondents share a similar education level: 72% of participants are Bachelor’s or Master’s students. Besides the demographic attributes, the average willingness to accept recommendations is 3.66 (SD = 1.211), above the median (M = 2.5). The average trusting intention is 14.16 (Max = 20, Min = 5), also above the median (M = 12.5). Taken together, these figures suggest that in this digital era, most respondents are still willing to accept recommendation systems to some extent.
Reliability and validity
To ensure that the research design has sufficient quality, factor analysis was used to test the measurement model and establish the convergent validity of trusting intention. Convergent validity refers to the degree to which tests designed to measure a particular variable actually measure the underlying theoretical construct because they share variance (Schwab, 1980). Reliability refers to the stability and internal consistency of the results produced by the measurement tools. Internal consistency reliability, though usually considered necessary, is not a sufficient condition for convergent validity (Schwab, 1980).
Four questions measure trusting intention, and all four are included in the reliability analysis to test internal consistency. The results show that the Cronbach’s alpha for Trust is 0.934, well above the 0.7 threshold, so the overall scale has a good level of reliability. The Item-Total Statistics show that deleting any single item does not significantly improve the reliability index, which further confirms that the reliability level is good.
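For readers unfamiliar with the statistic, Cronbach's alpha for k items is k/(k-1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The sketch below computes it in plain Python; the responses are made up for illustration and are not the study's data, and the function name `cronbach_alpha` is an assumption.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: list of k lists, each holding one item's scores across respondents."""
    k = len(items)
    # Sum of the sample variances of the individual items.
    item_var_sum = sum(variance(scores) for scores in items)
    # Total score per respondent (sum across the k items).
    totals = [sum(resp) for resp in zip(*items)]
    return (k / (k - 1)) * (1 - item_var_sum / variance(totals))

# Made-up answers from five respondents to four five-point items.
example_items = [
    [4, 5, 2, 3, 4],
    [4, 4, 2, 3, 5],
    [5, 5, 1, 3, 4],
    [4, 5, 2, 2, 4],
]
alpha = cronbach_alpha(example_items)
```

A value above the 0.7 threshold mentioned in the text would be read as acceptable internal consistency.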
Since all four questions address trusting intention, it must be verified that they belong to the same underlying concept. Principal component extraction was used to perform factor analysis on Trust. The results showed that the KMO value was 0.856, and the accompanying test was significant at the 0.001 level. The hypothesis that the variables are independent is therefore rejected, and the variables are considered to be strongly correlated, which makes the items suitable for factor analysis. Principal components were then extracted from the four trust questions using the criterion that the eigenvalue (characteristic root) is greater than 1. The factor analysis table in the appendix shows that one principal component is extracted from the four trust questions, and the cumulative variance contribution rate is 83.509%, which indicates a high degree of information extraction and fits the design intention. In addition, the scree plot shows how much information each factor covers in the factor analysis. Typically the curve is steep at first, since the first factor covers the most information, and then flattens as the contribution of later factors decreases. In our scree plot, the curve drops sharply after the first principal component, and the extracted factor covers most of the information in the original variables. The scree plot therefore supports the extraction of a single principal component.
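The link between the eigenvalue criterion and the variance contribution rate can be made concrete: for a correlation matrix of k items, a component's share of variance is its eigenvalue divided by k. The sketch below uses a hypothetical correlation matrix in which every pair of the four items correlates at 0.78, a value chosen only so that the first component's share lands near the reported figure; it is not the study's correlation matrix. Power iteration is used here as a simple stand-in for the extraction routine of a statistics package.

```python
def largest_eigenvalue(matrix, iterations=200):
    """Approximate the largest eigenvalue of a symmetric matrix by power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    eig = 0.0
    for _ in range(iterations):
        prod = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in prod) ** 0.5
        vec = [x / norm for x in prod]
        # Rayleigh quotient of the normalised vector estimates the eigenvalue.
        eig = sum(
            vec[i] * sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)
        )
    return eig

# Hypothetical equicorrelated matrix for four items (assumption, not study data).
r = 0.78
corr = [[1.0 if i == j else r for j in range(4)] for i in range(4)]

first = largest_eigenvalue(corr)   # first eigenvalue, 1 + 3r for this matrix
share = first / len(corr)          # proportion of total variance it explains
retain = first > 1.0               # eigenvalue-greater-than-1 criterion
```

In this constructed case the remaining three eigenvalues equal 1 - r = 0.22, so only one component passes the criterion.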
The varimax (maximum variance) method was used to rotate the scale items. The results show that the loadings of the four questions on the principal component lie between 0.899 and 0.921, all higher than the 0.5 threshold, so the scale has appropriate construct validity.
The manipulation checks for the curators’ expertise and popularity used a paired-sample t-test to see whether the means of the two paired variables differ significantly. To check the success of the perceived transparency manipulation, an independent-sample t-test was used.
Using paired-sample t-tests, the difference between the popularity and expertise of Fashnetic and the difference between the popularity and expertise of Devicnetic were tested. The results show that the perceived popularity of Fashnetic is significantly higher than its perceived expertise (t = 10.221, p < 0.001), and that the perceived popularity of Devicnetic is significantly lower than its perceived expertise (t = 10.221, p < 0.001). This indicates that the manipulation of the curator types was successful: Fashnetic (popularity based) was perceived as having high popularity but low expertise, and Devicnetic (expertise based) was perceived as having high expertise but relatively low popularity.
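As a reminder of what the paired-sample statistic measures, the sketch below computes it from scratch: the mean of the within-respondent differences divided by its standard error. The ratings are invented for illustration and are unrelated to the reported t values; `paired_t` is a hypothetical helper name.

```python
from statistics import mean, stdev

def paired_t(first, second):
    """Return the paired-sample t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    # Standard error of the mean difference uses the sample standard deviation.
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1

# Made-up five-point ratings of one curator by six respondents.
popularity = [5, 4, 5, 4, 5, 4]
expertise = [2, 3, 2, 2, 3, 1]
t_stat, df = paired_t(popularity, expertise)
```

A large positive statistic here would indicate that the same respondents rated popularity higher than expertise.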
The perceived transparency manipulation check used an independent-sample t-test. With the high/low transparency condition as the grouping variable, the difference in the transparency ratings given by respondents in the two conditions was tested. The results show that the perceived transparency score under the low-transparency condition is significantly lower than under the high-transparency condition, which is in line with the purpose of the manipulation.
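The independent-sample comparison can be sketched in the same way. The version below assumes equal variances in both groups (the pooled-variance Student's t); the text does not state which variant was used, so this is an assumption. The scores are invented and coded so that higher values mean more perceived transparency, which is also an assumption for the example.

```python
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Student's t statistic with pooled variance, plus degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / (na + nb - 2)
    se = (pooled * (1 / na + 1 / nb)) ** 0.5
    return (mean(group_a) - mean(group_b)) / se, na + nb - 2

# Made-up perceived-transparency scores for the two conditions.
low_condition = [2, 1, 2, 3, 2]
high_condition = [4, 5, 4, 3, 4]
t_value, dof = independent_t(low_condition, high_condition)
```

A negative statistic in this setup means the low-transparency group scored below the high-transparency group, which is the direction the manipulation aims for.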
To test the hypotheses in the conceptual model, this paper conducted multiple linear regression analyses. This statistical technique analyzes the relationships between independent variables, dependent variables, mediators, and moderators. Gender, age, and education level are taken as control variables, the type of recommendation system as the independent variable, Trust as the mediating variable, and the willingness to accept the recommendations as the dependent variable. The mediating effect was tested with a three-step procedure in PROCESS, and the moderating effect with stepwise regression. Lastly, the effect of the curator’s attributes, namely expertise and popularity, was tested with a simple regression. The results are presented in Tables 1, 2, and 3.
To find out whether customers who used CSs are more willing to accept the offered recommendations than those who used the general RS, a regression analysis was conducted. The result presented in Table 1 confirms a direct relationship between the type of recommendation system and the user’s willingness to accept the recommendations offered by RSs. The type of system significantly predicts the willingness to accept: curator system (β = 1.0147, p < 0.001), with the general recommendation system as the baseline (β = 0). The positive coefficient of the type of recommendation system (‘Curator System = 1’) indicates that involving curators in the recommendation system has a positive effect on the user’s willingness to accept the recommendations. Hypothesis 1 is therefore supported.
In this research, the mediating effect analysis used a three-step regression method and was conducted with the PROCESS plugin in SPSS, which runs the test at once rather than step by step. Like Baron and Kenny’s (1986) three-step approach, PROCESS divides its test procedure into three steps. The first step analyzes the direct relationship between the independent variable “Types of Recommendation systems” and the dependent variable “Willingness of accepting the recommendations”, which has already been confirmed under hypothesis 1. The second step regresses the mediator “Trust” on the independent variable “Types of RSs”. The third step regresses the dependent variable “Willingness of accepting the recommendations” on the independent variable “Types of RSs” after adding the mediator Trust. If the regression coefficient of the independent variable in the third step is smaller than its coefficient in the first step, a mediating effect is indicated. In PROCESS, the type of mediation can be examined by looking at the direct and indirect effects of the independent variable on the dependent variable. If both effects are significant, the mediator plays a partial mediating role. If only the indirect effect is significant and the direct effect is not, the mediator plays a full mediating role.
The mediating effect results of this study are shown in the table above. After controlling for gender, age, and education level, the type of recommendation system (Curator System = 1) significantly and positively predicts the user’s willingness to accept the recommendations compared with the general recommendation system, which serves as the baseline (β = 1.0147, p < 0.001). In the model with “Types of RSs” as the independent variable and “Trust” as the dependent variable, types of RSs significantly and positively predicts trust (β = 3.5816, p < 0.001). In the model where “Types of RSs” is the independent variable, “Trust” is the mediating variable, and “Users’ willingness to accept the recommendations offered by recommendation systems” is the dependent variable, types of RSs has no significant predictive effect, but Trust significantly predicts the willingness to accept (β = .2402, p < 0.001). Accordingly, the direct effect of types of RSs on willingness to accept is insignificant (β = .154, p = 0.3208 > 0.05), and the indirect effect is significant (β = .860, p < 0.001).
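The reported coefficients can be cross-checked with simple arithmetic. In linear mediation models estimated by ordinary least squares on the same sample, the indirect effect equals the product of the two path coefficients, and the total effect equals the direct effect plus the indirect effect. The sketch below applies these identities to the values reported in this paragraph; the variable names are illustrative, and small differences are expected from rounding.

```python
# Coefficients as reported in the text.
a = 3.5816        # types of RSs -> trust
b = 0.2402        # trust -> willingness to accept, with types of RSs in the model
c_total = 1.0147  # types of RSs -> willingness to accept, without the mediator
c_direct = 0.154  # types of RSs -> willingness to accept, with trust in the model

# Indirect effect as the product of the two paths.
indirect = a * b

# Total effect reconstructed from its direct and indirect parts.
reconstructed_total = c_direct + indirect
```

The product of the two paths is about 0.860, matching the reported indirect effect, and the reconstructed total agrees with the reported total effect to about three decimal places.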
Based on the above results, trust plays a significant and full mediating role between the type of recommendation system and the willingness to accept the recommendations, which supports hypothesis 2 that trust mediates the relationship between the type of system used and the willingness to accept the recommendations.
For hypothesis 3, a separate regression was conducted, which makes the result clearer to analyze. The results in Table 2 show that, although the expertise of Fashnetic and the popularity of Devicnetic did not significantly affect the user’s willingness to accept, both the popularity and the expertise of curators are positively correlated with the user’s willingness to accept the recommendations offered by CSs (β = .193, p < 0.001; β = .055, p = 0.026; β = .034, p = 0.213; β = .438, p < 0.001). Hypotheses 3a and 3b, for the popularity and expertise of curators, are therefore both supported. The insignificant results may stem from how the curators’ characteristics were set: Fashnetic was given a high level of popularity and a relatively low level of expertise, whereas Devicnetic was given a relatively low level of popularity and a high level of expertise. As a result, the user’s willingness to accept Fashnetic’s recommendation rests more on the curator’s fame, and the willingness to accept Devicnetic’s recommendation rests more on the curator’s expertise.
Table 2: Relationship of the curator’s expertise and popularity levels with the user’s willingness to accept the recommendations
To test hypothesis 4, a stepwise regression was used to test the moderating effect of transparency on the relationship between trust and users’ willingness to accept recommendations. The first layer included gender, age, and education as control variables; the second layer added trust as the independent variable and transparency as the moderator; the third layer added the interaction term of trust and transparency.
As shown in Table 3, Model 1 represents the influence of the control variables on the user’s willingness to accept the recommendations. None of the control variables has a significant predictive effect on the willingness to accept. Model 2 shows that Trust has a significant positive effect on the willingness to accept the recommendations (β = .742, p < 0.001), while the moderator Transparency has no significant effect. Finally, the interaction term Trust * Transparency is introduced in Model 3, and its coefficient is significant (β = -.698, p < 0.001). This shows that Transparency significantly moderates the relationship between Trust and the user’s willingness to accept the recommendations. However, the interaction term is negative, which means that higher transparency of the systems’ recommending process does not positively affect the perception of trust in the recommendation systems. Hypothesis 4 is therefore rejected.
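The final layer of the moderation model for hypothesis 4 can be sketched as an ordinary least-squares fit that includes the product of trust and transparency as an extra predictor. The example below is a simplified illustration under several assumptions: the data are made up and noise-free, the control variables are omitted for brevity, and the small `ols` solver (normal equations with Gauss-Jordan elimination) stands in for the regression routine of a statistics package.

```python
def ols(rows, y):
    """Least-squares coefficients; each row must already include the intercept 1.0."""
    p = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(p):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * z for x, z in zip(aug[k], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]

# Made-up data: trust score, transparency dummy, and an outcome built with a
# negative interaction so the expected coefficients are known in advance.
trust = [8, 10, 12, 14, 16, 18, 8, 10, 12, 14, 16, 18]
transparency = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
willingness = [
    0.5 + 0.25 * t + 1.0 * z - 0.1 * t * z for t, z in zip(trust, transparency)
]

# Design matrix: intercept, trust, transparency, trust x transparency.
design = [[1.0, t, z, t * z] for t, z in zip(trust, transparency)]
intercept, b_trust, b_transparency, b_interaction = ols(design, willingness)
```

In this constructed example the fit recovers a negative interaction coefficient, meaning the slope of trust is flatter in the high-transparency group than in the low-transparency group, which is how a negative Trust * Transparency term is usually read.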