Transfer learning has benefited many real-world applications where labeled data are abundant in source domains but scarce in the target domain. As there are usually multiple relevant domains where knowledge can be transferred, multiple source transfer learning (MSTL) has recently attracted much attention. However, we are facing two major challenges when applying MSTL. First, without knowledge about the difference between source and target domains, negative transfer occurs when knowledge is transferred from highly irrelevant sources. Second, existence of imbalanced distributions in classes, where examples in one class dominate, can lead to improper judgement on the source domains' relevance to the target task. Since existing MSTL methods are usually designed to transfer from relevant sources with balanced distributions, they will fail in applications where these two challenges persist. In this article, we propose a novel two-phase framework to effectively transfer knowledge from multiple sources even when there exists irrelevant sources and imbalanced class distributions. First, an effective supervised local weight scheme is proposed to assign a proper weight to each source domain's classifier based on its ability of predicting accurately on each local region of the target domain. The second phase then learns a classifier for the target domain by solving an optimization problem which concerns both training error minimization and consistency with weighted predictions gained from source domains. A theoretical analysis shows that as the number of source domains increases, the probability that the proposed approach has an error greater than a bound is becoming exponentially small. We further extend the proposed approach to an online processing scenario to conduct transfer learning on continuously arriving data. Extensive experiments on disease prediction, spam filtering and intrusion detection datasets demonstrate that: (i) the proposed two-phase approach outperforms existing MSTL approaches due to its ability of tackling negative transfer and imbalanced distribution challenges, and (ii) the proposed online approach achieves comparable performance to the offline scheme.